D56261GC10
Edition 1.0
July 2009
D61351
Author
Copyright © 2009, Oracle. All rights reserved.
Editors
Nita Pavitran
Raj Kumar
Graphic Designer
Rajiv Chandrabhanu
Publishers
Sujatha Nagendra
Veena Narasimhan
Michael Sebastian Almeida
Case Study
Contents
RISD_Project_Plan_V1.3
Tasks
ID Task Name Duration Start Finish Resource Names % Complete
41 Requirements Definition Process 9 days Fri 2/13/04 Wed 2/25/04 8%
42 Develop MoSCoW list Report Requirements [State = Refine] 4 days Fri 2/13/04 Wed 2/18/04 8%
43 Assessment Team Kickoff 0.25 days Mon 2/16/04 Mon 2/16/04 OPM, Director, Arch, District 100%
44 Functional reports team work session 1 day Wed 2/18/04 Wed 2/18/04 Team[50%], OPM[50%], Arch[50%] 0%
45 Initial Data Model Review 2 days Fri 2/13/04 Mon 2/16/04 OPM[25%], RPM[25%], District[25%], Director[25%], Arch[25%] 0%
Obtain Existing
60 Define Security Requirements (Initial) 1 day Mon 3/1/04 Tue 3/2/04 Director 0%
61 Technical Architecture Process 7 days Mon 2/23/04 Tue 3/2/04 0%
62 Define Architecture [State = Refine] 2 days Mon 2/23/04 Tue 2/24/04 OPM[50%] 0%
63 Define Capacity Plan [State = Initial] 2 days Wed 2/25/04 Thu 2/26/04 Arch[25%] 0%
64 Construct Development Environments [State = Initial] 2 days Thu 2/26/04 Fri 2/27/04 District 0%
65 Define Detailed System Operational Requirements 1 day Fri 2/27/04 Fri 2/27/04 OPM 0%
82 Create Reporting Requirements Document 30 days Mon 2/23/04 Fri 4/2/04 OPM[25%] 0%
83 Signoff Reporting Requirements Document 3 days Mon 4/5/04 Wed 4/7/04 RPM 0%
84 Requirements Analysis Process 5 days Mon 3/1/04 Fri 3/5/04 0%
85 Construct Business Data Model [State = Final] 2 days Mon 3/1/04 Tue 3/2/04 Arch[25%] 0%
86 Construct Preliminary Database Object 1 day Wed 3/3/04 Wed 3/3/04 Arch[50%] 0%
87 Conduct Data Quality Assessment 2 days Wed 3/3/04 Fri 3/5/04 District, Director[0%] 0%
88 Data Acquisition Process 59 days Fri 3/5/04 Thu 5/27/04 0%
Assignments
Task ID Task Name Resource Name Work Start Finish % Work Complete
4 Obtain Board Approval District 8 hrs Mon 1/19/04 Mon 1/19/04 100%
5 Sign Contract District 112 hrs Tue 1/20/04 Fri 2/6/04 100%
5 Sign Contract Oracle 112 hrs Tue 1/20/04 Fri 2/6/04 100%
6 Obtain Development/Test District 80 hrs Mon 2/9/04 Fri 2/20/04 75%
7 Set up Dev/Test Environment District 24 hrs Mon 2/23/04 Wed 2/25/04 0%
8 Obtain and setup Production Environment District 600 hrs Tue 1/20/04 Mon 5/3/04 0%
48 Deploy Development Standards and Guidelines OPM 2 hrs Thu 2/26/04 Thu 2/26/04 0%
48 Deploy Development Standards and Guidelines RPM 2 hrs Thu 2/26/04 Thu 2/26/04 0%
49 Construct Business Data Model [State = Revise] Arch 8 hrs Wed 2/18/04 Wed 2/18/04 0%
50 Define Solution Integration Functional Architecture OPM 4 hrs Thu 2/19/04 Fri 2/20/04 0%
51 Obtain and Review Campus EAI templates Director 8 hrs Wed 2/18/04 Wed 2/18/04 0%
52 Define Solution Integration Technical Architecture OPM 4 hrs Thu 2/19/04 Fri 2/20/04 0%
53 Define Data Quality Approach [State = Final] OPM 2 hrs Mon 2/23/04 Mon 2/23/04 0%
M1_RISD_PMP_V1.doc
Approvals:
RISD: _____________________________________________
Oracle: _____________________________________________
Project Management Plan Doc Ref:RISD/DW/004
Document Control
Change Record
Name Position
Distribution
Contents
Introduction
In January 2004, the RISD and Oracle representatives met to discuss the details of the
Data Warehouse project. The board approved the proposal and the contract was signed.
The contract reflects the scope agreed upon in the Data Warehouse proposal.
Purpose
The purpose of this Project Management Plan (PMP) is to confirm the scope of work to
be performed as agreed upon in the Oracle Proposal document and the corresponding
Fixed Price contract. It also defines the approach to project management and quality
control that will be applied to the Data Warehouse, PHASE I project for the Roy
Independent School District (RISD). This document is intended to supplement the original
statement of work in the proposal and the contract by providing additional details
regarding project work products and milestone completion criteria. A detailed baseline
work plan accompanies this project management plan document. The high-level project
plan is attached as Appendix A.
Background
The Data Warehouse environment will provide decision makers throughout the District
with information to help improve student achievement. Users will access the system via
Windows based Internet-capable computers accommodating the needs of both computer
novices and experts. Discoverer report data will be available online and can be
downloaded into local applications where appropriate (for example, spreadsheets and
PC databases) to perform additional analysis or for integration with local data.
This project will utilize Oracle Data Warehousing Methodology (DWM) and Project
Management Methodology (PJM). These are discussed more fully later in the Project
Methods section.
• Scope
• Objectives
• Approach
• Project Tasks, Work Products, and Milestones
Standards and procedures will be followed in the following areas:
• Control and Reporting (issue management, scope change control, progress
reporting, and so on)
• Work Management (management of tasks and project budget information)
• Resource Management (management of staff including Roles and Responsibilities,
as well as physical resources)
• Quality Management (reviews of Work Products, testing, and so on)
• Configuration Management (how project intellectual capital and software are to be
managed)
The Project Management Plan (PMP) defines the implementation approach needed to
meet the defined objectives.
Related Documents
• RISD DW baseline Project Plan
• Proposal
• Fixed Price Engagement Contract Order Form
Scope
Scope of Project
Oracle shall perform the planning, design, development and deployment of the RISD
Data Warehouse Phase I to provide decision makers throughout the District with
information to assist in improving student achievement.
Oracle’s approach includes shifting responsibility for the technical environment to the
RISD DBA and System Administrator during the Design and Build Phase of the project.
This approach coincides with Oracle’s objective to enable the RISD staff to administer
the technical environment by project completion with minimal assistance from Oracle.
RISD should attempt to use the same versions of software across applications where
possible. For example, the Operational Data Store (ODS) developed by RISD staff and
the Data Warehouse developed primarily by Oracle staff should use the same Oracle
10g version of the database. If Oracle Warehouse Builder (OWB) is used for the ETL
process and the design of the ODS, the DW and ODS environment should use the same
versions.
Data Quality
RISD will be responsible for data cleansing before data is extracted from the ODS to load
the data warehouse. Data elements not passing simple edits will be set to unknown in the
DW ETL process in order to avoid dropping records during the database load. Corrections
such as resolving duplicate student records or merging student records after the load will
be the responsibility of the district. However, the DW ETL correction process must have
the ability to correct student records. Oracle will accept a preliminary assessment data
extract and, at some point, the final assessment data extract. Decisions on whether the
DW pulls data from the ODS or the ODS pushes data to the DW will be determined during
detailed design. The ETL process will be designed with the capability to delete a load and
reload a replacement file with corrected data provided by RISD.
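To make the default-to-unknown rule concrete, the following Python sketch illustrates the kind of simple edit the DW ETL process would apply; the field names, edit rules, and the UNKNOWN marker value are illustrative assumptions, not the actual OWB transformation definitions.

# Illustrative sketch only: field names, rules, and the UNKNOWN marker are
# assumptions, not the project's actual OWB transformation definitions.
UNKNOWN = "UNKNOWN"

# Simple edits supplied by the district business analysts,
# for example "scale score must be numeric".
SIMPLE_EDITS = {
    "scale_score": lambda v: v is not None and str(v).isdigit(),
    "grade_level": lambda v: v in {"KG", "01", "02", "03", "04", "05"},
}

def apply_simple_edits(record: dict) -> dict:
    """Default any value that fails its edit to UNKNOWN instead of dropping
    the record, so the database load keeps every row."""
    cleaned = dict(record)
    for field, passes in SIMPLE_EDITS.items():
        if field in cleaned and not passes(cleaned[field]):
            cleaned[field] = UNKNOWN
    return cleaned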
Data Acquisition
All data will be stored in the RISD ODS before being extracted and loaded into the
Data Warehouse. Extract programs and testing to obtain source data for loading into the
ODS are the responsibility of RISD. Oracle will design, develop, and test software to
populate the Data Warehouse target tables and materialized views. (The final design will
determine whether the ODS will "push" data to the DW or the DW will "pull" data from the
ODS.) Business rules for ensuring records have a valid student identifier will be a
district responsibility. Records that fail the edit will be assigned a temporary unique
student identifier in the ODS so that the data can be loaded into the DW to ensure
summary reports are correct. Student identifiers will be corrected as quickly as possible
and sent to the data warehouse as part of a full replacement file. Simple business rules
such as editing for a numeric field will be provided to the ETL process by Roy business
analysts. As mentioned, data not passing these simple edits will be defaulted to
unknown. Business rules to define metrics for inclusion in the Data Warehouse will be
defined by RISD business analysts but Oracle Consulting will be responsible for creating
the derived fields within the Data Warehouse. In some cases, metrics can be created
within the report. OWB will be used to document these business rules.
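As a rough illustration of the temporary-identifier rule described above, the Python sketch below assigns a placeholder identifier to records that fail the valid-student-ID edit; the TMP prefix, counter scheme, and field names are illustrative assumptions, since the actual rule is defined in the ODS/ETL detailed design.

from itertools import count

# Illustrative sketch only: the "TMP" prefix, counter scheme, and field names
# are assumptions standing in for the detailed ODS/ETL design.
_tmp_ids = count(1)

def ensure_student_identifier(record: dict, known_ids: set) -> dict:
    """Assign a temporary unique identifier to a record that fails the
    valid-student-ID edit so it can still be loaded into the DW and counted
    in summary reports; the district later supplies the corrected ID in a
    full replacement file."""
    if record.get("student_id") not in known_ids:
        record = dict(record,
                      student_id=f"TMP{next(_tmp_ids):08d}",
                      temp_id_flag="Y")
    return record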
Data Conversion
Conversion and update processes will use the same software. RISD will ensure all
conversion history will be available in the ODS environment. That is, RISD must obtain
data for prior years for the initial load process. After conversion, RISD may choose to
delete older historical data in the staging area, ODS, or both.
R10, RIMS, PEIMS information will be updated monthly in the Operational Data Store
(ODS) staging area. Data needed to update the Data Warehouse will be accessed via a
single database link to the ODS. Please note that there is overlapping information in
some of the source systems that may be out-of-sync in the source systems. The intent of
the ODS will be to develop business rules to have a single source for every target
element in the Data Warehouse. Since R10 is the official source for state reporting, it is
likely that R10 will be considered the system of record for overlapping elements.
However, detailed business rules will be developed during the design of the ETL process
for the ODS and it may be necessary to carry multiple elements depending on the
reporting needs. The goal of the DW is to have a “single version of the truth” for district
student profile information as much as possible. The business rule to use the student
demographics associated with each assessment will cause the student’s demographic
information to vary in some cases. For example, the TAKS report would show the student
as grade 2 and the SDAA report would show the student as grade 1 at the same point in
time. The only way to avoid this would be to use just a single source for student
demographics, like the SIS.
Assessment Information
The ETL process can accept data in flat file formats or simple tables from the staging
area but the final design expects to obtain all Phase 1 source data from the operational
data store. Business rules can be applied to cleanse key identifiers before the data is
made available to the Data Warehouse. Please note that student data provided by the
assessment office must be maintained in addition to the SIS Student data.
Student profile data provided by the assessment office will be used for state reporting;
in other cases, when cross-assessment test reporting is required, SIS student profile
data may be used.
Since the actual dates on which data becomes available for processing vary, all update processes will need
to be scheduled. It is likely that the DW will run an automated CRON job to determine if
assessment data is available. Dependencies will be noted in the documentation process.
For example, all students on an assessment file must already have a matching student
record. As noted earlier, a process to create a temporary student identifier is needed to
ensure no assessment records are dropped. The DW ETL business rule will be to add
the record unless the student ID is invalid. Duplicate student IDs for the same subject for
the same test will be added through sequence loading.
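As a sketch of the availability check such a scheduled (cron) job might perform, the Python below queries a hypothetical assessment log table; the table name, columns, status codes, and connection details are assumptions rather than the actual design.

import oracledb  # python-oracledb driver; connection details are placeholders

# Illustrative sketch only: the assessment_log table, its columns, and the
# status codes are assumptions standing in for the detailed design.
def assessments_ready_for_update(conn) -> list[str]:
    """Return assessment areas whose extract is complete in the ODS but has
    not yet been applied to the DW, so the scheduled job knows which update
    processes to start."""
    sql = """
        SELECT assessment_area
          FROM assessment_log
         WHERE ods_status = 'COMPLETE'
           AND dw_status IS NULL
    """
    with conn.cursor() as cur:
        cur.execute(sql)
        return [row[0] for row in cur.fetchall()]

if __name__ == "__main__":
    conn = oracledb.connect(user="dw_etl", password="change_me", dsn="odshost/ODS")
    for area in assessments_ready_for_update(conn):
        print(f"Assessment data available for {area}; starting DW update")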
Annual flat files will be provided, plus periodic supplementary data files for new students.
RISD will provide a full Location hierarchy on a monthly basis when the SIS data is
updated. Detail layouts will be created during the design phase.
District – District Administrators
Area – Area Administrators
Location Type (HS, JHS, Elementary)
Campus (school) – Principal and School administrators
Portal Obligations
The third party vendor "Campus EAI" will install Oracle 10g AS Version 10.1.2 or higher
and set up the basic look and feel, graphics, template pages, portlets, RISD content, and
users and roles needed for the Data Warehouse. Therefore, the portal design is primarily a RISD responsibility, with assistance from Campus EAI.
Aggregate reports may link to one or more detail reports. In some cases, reports will
have the ability to “drill” to lower levels in a hierarchy. For example, a report may begin
with a district summary and drill though the hierarchy down to a specific school.
As noted earlier, some aggregate reports may need the ability to view a breakout of
report data using various dimensions. Sample dimensions are listed below; actual
dimensions will be defined during the requirements process.
• Location Hierarchy
∗ District Total
∗ Area
∗ Campus Type
∗ Campus (School)
• Gender
• Ethnicity
• Language Classification
∗ Home Language
• Poverty Indicator
∗ Meal Program (Free, Reduced)
The Data Warehouse will need to capture and display data as reported by the
assessment office to the state as well as reported by the internal RISD student SIS
system. This will be done in separate reports for simplicity of understanding.
1. School Board (Includes School Board & Public) – Achievement Aggregate Data (no
individual students identified). No sign-in is necessary.
2. Central Staff (Includes Superintendent, Central Administrators, Researchers) –
Superintendent, Central Administrators, and Researchers can see all aggregate and
detail student data.
3. Area Staff (Includes Area Superintendent & Area Administrators) – Area
Superintendent and Administrators can only see the current students of the area.
4. Campus Staff (Includes Principals & Campus Administrators) – Principal & Campus
Administrator can only see the current students of the school.
5. Teachers - Teacher can only see his or her current students.
6. Students (Includes Parents & Students) – Student can only see his or her
information, Parent can see only his or her students.
RISD will provide interface files with the Employee to location relationships and the
employee to student relationship for teachers (course enrollment data). RISD will add
new users to the OID directory for the portal that drives the Data Warehouse and send
security relationship data to the Data Warehouse. The format of these interface files will
be developed during the security design phase after the security requirements are fully
documented. In addition, a simple Oracle Form will be created to allow a RISD
administrator to add a user to an existing security group (listed above). For example, for
small user groups such as central staff with ad hoc capabilities, it may be easier to add
and delete users online rather than develop an interface. The creation of new user
groups is out of scope for this project.
Oracle is responsible for ensuring the update process loads data from the source files as
per the documented update specifications and properly applies business rules for
transformations and the creation of standard metrics.
Oracle is responsible for ensuring the 8 unique and 2 dashboard duplicate reports they
Data load performance testing will test the ability of the system to handle the data load
and update volumes. Since the updates are no more frequent than monthly and all data
files are not available at exactly the same time, performance loading should not be of
consequence as it can be scheduled on off hours and weekends during an update
window. With the possible exception of the initial conversion, normal update cycles
will fit into an overnight update window.
Report generation performance has two pieces. Static reports that are created once and
viewed by many will be created in “batch” immediately after the data loads and published.
For example, static aggregate metrics that would be posted on the dashboard page
would probably be created in advance. Reports that allow parameters to be entered or
drill-down on dimensions will be run in real time. These
requirements will be documented when the report specifications are gathered. Response
time will be dependent on the number of users simultaneously accessing the system to
create reports.
ETL and report acceptance criteria will be documented in the ETL and Report Test plans.
• RISD will provide Oracle with a stable technical environment and ensure that
hardware and operating system environments are available when required and in
working order.
• The single development/test HP machine will have sufficient storage to support
two full copies of the initial conversion data that will be used in unit and system testing.
Usable space is estimated at 500 GB for the development/test environment.
Please note, if mirroring is used, 1 TB is needed to obtain 500 GB of usable
storage. At the end of the project Oracle Consulting will clone the initial production
environment to create the development/unit test DW and the Test DW.
• The production machines will be available by May 1, 2004, or sooner.
Scope Control
Change control will be managed through the Change Control procedure defined in the
proposal. The form is in Appendix C of this document.
Objectives
District Goal
The goal of the Data Warehouse is to supply useful consistent information to decision
makers at various levels of the District via a Web-based interface in order to work
towards improving student achievement, projecting trends in student achievement, and
implementing or refining instructional interventions and programs in a timely manner.
Project Objectives
The objectives of the project are to:
Approach
The approach includes the following main areas:
• Project Methods
• Project Work Products Completion Criteria
• Plans
• Acceptance
• Project Administration
Oracle will use Oracle Designer and Oracle Warehouse Builder (OWB) to design the
process. The logical data model will be created in Designer using Oracle’s experience
with K-12 data models to create the initial design. Similarly, Oracle will use its K-12
experience to create the initial OWB Meta Data Layer (MDL). Designer information can
be imported into OWB for full documentation. Naturally, the data model and metadata will
be adjusted to support Roy's unique requirements.
Work Management
Resource Management
Quality Management
Configuration Management
PJM tasks are organized into processes that help project management understand the
tasks to be performed for a successful project. The PJM processes are as follows:
The Control and Reporting process determines the scope and approach of the project,
manages change, and controls risks. It contains guides for reporting progress status
externally and for controlling the Quality Plan.
The Work Management process defines, monitors, and directs the work performed on
the project. It also maintains the financial view of the project for Oracle management.
The Resource Management process determines the right level of staffing and skills for
the project, and the working environment to support them.
The Quality Management process implements quality measures, so that the project
responds to the District’s expectations throughout the project life cycle.
The Configuration Management process stores, organizes, tracks, and controls the
items produced from the project. It also provides a single location from which the project
work products are published.
Acceptance
All Work Product project documentation will be submitted not more than twice and the
review must be completed within a total of three business days. Draft Work Products will
be available for review before the “final” Work Product is presented. The RISD project
manager will review all Work Products within three (3) working days from delivery and
return one consolidated set of comments to Oracle. Upon receipt of comments from
RISD, Oracle will resubmit the Work Product in final form after incorporating the
applicable and necessary changes. When the final form is submitted, acceptance will be
based on the comments previously raised; no new issues can be raised. Following
Oracle’s delivery of the final version of documentation, assuming proper incorporation of
team member comments, RISD or its representative will have three (3) working days
from delivery in which to provide Oracle with written notification of its acceptance of the
Work Product. RISD written notification will normally take the form of an executed copy of
Oracle’s standard Certificate of Acceptance. In the absence of RISD’s specific
acceptance or rejection of these Work Products at the end of the three (3) working day
acceptance periods (draft and final, respectively), the Work Product and applicable
services will be deemed to have been approved and accepted. However, the intent is to
obtain Roy’s comments and feedback for work in progress to minimize the time for
“official” review and acceptance.
• RISD has provided a workroom with 4 terminals and two work areas for team
meetings.
• All draft and final work products, and DSS related documents, will be stored on the
project server provided and backed up by RISD.
• Oracle will honor all holidays recognized by Oracle or RISD, although with
permission, Oracle may choose to work on holidays and weekends.
• Oracle may work off site during extended Roy vacations, such as the Spring
Break, March 4 – March 11.
• There is a two- to three-week break from the end of June to mid-July when most staff
are not available; this must be factored into the schedule.
• There should be no Friday meetings scheduled for June as many staff members
are on a 4-day, 10-hour schedule.
• School begins 8/17 and there will be limited access to Roy personnel between
8/10 and 8/24.
Issue Management
Issues and action items will be tracked using an Excel spreadsheet maintained by the
Oracle project manager. Roy's internal tracking system may be used at some point
later in the project.
The project managers will ensure action item commitment dates are met, and unresolved items will be brought
to the steering committee for resolutions. Action items will be tracked in the RISD
tracking system.
Risk Management
A Risk Management Procedure will be implemented by tracking risks in an Excel
spreadsheet. Risks will be reviewed periodically at weekly progress meetings. Any team
member can identify a risk to either the Roy project manager or the Oracle project
manager. For new risks, the project managers will make a joint decision on whether or
not a special meeting is required or if the risk can be discussed at the next scheduled
status meeting. The Oracle project manager will maintain the risks spreadsheet.
Change Management
To complete the project on time and on budget, it is imperative to manage to the original
scope of the project. The Change Request form will be used to propose a change to
initial activities or to add project activities. The forms will be filled out even if the change
or addition has no impact on schedule or budget. Naturally, change requests having an
impact on schedule or budget need to be addressed and approved or rejected quickly.
Any project team member may initiate the Change Order process. Initially, a Change
Request Form, Attachment C, will be prepared and provided to the Project Managers.
The Oracle and RISD Project Managers will review all open requests weekly. When a
Change Order that affects schedule or budget is necessary, the Project Managers will
present their findings to the Project Sponsors. Approved budget changes will require a
change order.
meetings within this plan. Team Progress Reviews will be held on a weekly basis to
assess the progress of each team member and to plan for the following week(s).
Progress Reporting
Oracle shall provide a written weekly status report:
• Project-to-date summary and status (green, yellow, red)
• Reporting week accomplishments and work activities
• Milestone status and targets
• Planned activities for the next period
• Open issues
• Open TARs
Administrative Plans
Work Management
The Work Management process is responsible for defining, monitoring, and directing all
work performed on the project. In addition, it must maintain a financial view of the project
for management review.
The Work Plan for this project is maintained in Microsoft Project. The project plan will
supplement this document.
Unplanned activities will be tracked by each team member and documented in the
weekly status reports to the Oracle Project Manager. The Oracle Project Manager will
work with the RISD Project Manager to determine impact to the project.
Resource Management
The Resource Management process provides the project with the right level of staffing
and skills and the right environments to support them. It does not cover purchasing,
recruiting, and accounting procedures; these are practice management issues and are
therefore outside the scope of this Project Management Plan. Oracle Consulting will bring
resources to the project as appropriate to meet project schedules. At times it may be
necessary to “crash” the schedule by bringing on additional resources and at other times
to reassign resources to other projects in order to manage the budget.
Staff Resources
Project Team
Configuration Management
Three types of procedures will be developed: Document Control, Configuration Control
and Release Management.
This section describes the directory structure and versioning scheme for development,
test, and production environments.
Working code is stored in a repository on the developer's PC, but all developers will
back up their code once a day.
Document Control
All documents produced as project work products will be subject to Version Control.
Release Management
Over the course of this project, a single release will be developed and transitioned to
production.
The following roles were extracted from the Proposal accepted by RISD. In general, the
assigned persons will be responsible for tasks requiring specific skill sets. When the
District resources change, the District Project Manager will ensure that these project
team members provide the transition handover to ensure that there is no delay in the
project activities or productivity level and ensure that coverage exists for any outstanding
assignments. If resource changes occur within the Oracle staff, the Oracle Project Manager
will use the same approach to maintain continuity and avoid unnecessary delays in
progress. Please note that not all roles are full time for the project and in some cases the
same person may fill multiple roles.
Some tasks will require representation from various District functional areas. The District
will provide all necessary resources to support these tasks.
Steering Committee – The District and Oracle Team will provide steering committee
members. Steering committee members will provide project leadership, decide scope
issues, and resolve escalated issues.
District Project Manager – The District will provide a project manager whose chief
responsibility will be to act as liaison between Oracle Team and the District.
District User Groups – These resources will represent the end user communities and
provide requirements and functional expertise. They will also participate in the user
acceptance testing of the system.
Oracle Project Manager – This resource will serve as the on-site Oracle Project Manager
and monitor the team’s efforts in tracking to the project workplan.
Oracle OWB Specialist – The Oracle OWB Specialist will be supporting the designing,
building, and testing of all processes involving the transformation and loading of data.
District OWB Specialist – The District OWB Specialist will be supporting the designing,
building, and testing of all processes involving the transformation and loading of data. This
resource will be peer mentored with the intent that they can develop and maintain the
system going forward.
District Source System Specialists – The District will be an “as needed” resource (full
time at certain key junctures) for the data residing in the source applications. This person
will provide the data and related format information, answer questions from Oracle technical
consultants, as well as participate in data mapping activities. The District Source System
Specialists will be responsible for developing any data extracts on the source system.
There must be a source system specialist for each source system to be incorporated into
the Data Warehouse.
District DBA/Oracle Designer – The District resource will support the designing of the
logical and physical data models, and tune database objects comprising the reporting:
data stores, transformation/load processes, and all other objects developed by the
technical team. This resource will be peer mentored with the intent that they can develop
and maintain the system going forward.
District Network Administrator – The District will provide on a limited basis technical
support relating to the District’s intranet and the network administration.
District System Administrator – The District will provide on a limited basis technical
support relating to the District’s systems administration, and core Oracle database duties
(startup, shutdown, backups, and any other requisite system administration functions).
Oracle Data Access Specialist – Oracle will provide technical leadership and support
the District in designing and building the End User Layer (EUL) and a limited number of
reports using Discoverer.
Campus EAI / District Portal Developer – Campus EAI / District will provide technical
leadership and support the District in the designing and building of the Enterprise portal.
Oracle Data Warehouse Portal Developer – Oracle will provide technical extensions to
the enterprise portal for the data warehouse in the area of security and Dashboard
integration.
District Portal Developer – The District resource will support the designing and building
of the Data Warehouse portal using Oracle Portal. This resource will be responsible for
developing and maintaining the system going forward.
District Training Specialist – The District resource will support the testing of the Data
Warehouse, and support the planning and developing of the training materials. The
District training specialists will receive coaching/mentoring from Oracle and they will be
responsible for training end users.
Schedule:
Date: Date:
Completion Verified By: Completion Date:
Last Updated:
Document Ref: RISD/DW/002
Version: 3.0
Approvals:
RISD
Oracle
RD.049 Portal Security Requirements Doc Ref: RISD/DW/002
Document Control
Change Record
Name Position
Distribution
Contents
Document Control
Change Record
Reviewers
Distribution
Introduction
This portal security requirement deliverable documents the Phase 1 Data Warehouse
(DW) report integration with Oracle Portal. Campus EAI will provide RISD with
assistance with the installation of Oracle 10g Application Server (10gAS) by providing a
repository of portal pages and portlets that will be used by the RISD portal development
team to support RISD requirements. Oracle Consulting will be responsible for the
content of the framework pages using the DW as inputs and the database security to
restrict data access by security group. The original scope of the baseline portal
implementation project plan is to provide data access to RISD employees, the BOT
(Board of Trustees), students, parents, and the general public. Campus EAI suggested a
phased approach where a subset of users will have access through portal before rolling
out access to all users.
The revised initial implementation approach will be to allow access to RISD employees
and the RISD board. All users in these groups will have access to summary assessment information.
The system will be designed to allow student, parent, and the general public access in
future releases. Security will be defined for Phase 1 to allow RISD to add these security
groups independent of the post Phase 1 DW implementation schedule.
• The BOT (RISD Board) will have access to summary information down to the
campus level. If RISD limits BOT access, a special BOT report menu may need
to be developed.
• District administrators will have access to all data including student information
for both current and former students.
• Area administrators will have access to all aggregate data but only access to
student information for students in their area (current and former students).
• School administrators will have access to aggregate data but only access to
students currently in their campus.
• Teachers will have access to all aggregate data but only access to student
information for students currently in their classes.
• Parents will have access to all aggregate data but only have access to their
child’s assessment information. For simplicity, a parent will have separate IDs if
they have multiple children in district schools.
• Students will have access to all aggregate data but only have access to their own
assessment information.
Purpose
This document defines the data integration between Oracle Portal and Oracle
Discoverer report information. Since data security will be assigned at the database
level, it will apply when accessing standard reports via Portal, when accessing data
directly through standard SQL, and when accessing data through Discoverer Plus.
The Data Warehouse Portal Security Design Document describes the functionality of the
portal pages required to access portal dashboards and Data Warehouse reports by user
security group. The look and feel, navigation colors and graphics and the like for the
portal are not part of these requirements; only the Data Warehouse page content, report
links and how roles and location defaults will operate are included.
Background
The Data Warehouse environment will provide decision makers throughout the District
with information to help improve student achievement. Users will access the system via
Windows based Internet-capable computers and workstations to accommodate the
needs of both computer novices and experts. Discoverer report data can be downloaded
into local applications where appropriate (for example, spreadsheets and PC databases)
to perform additional analysis or to integrate it with local data.
Individual data access will be restricted to information associated with the user’s
responsibilities, needs, and skills. The majority of end users will run parameter driven
reports from a predetermined menu. However, dimensions such as location can be
changed to view different levels of detail. For example, a user may start with a district
view for a report and then “drill” to lower levels of the hierarchy such as area and
campus.
Some power users may go directly to Discoverer Plus for added functionality and to
Related Documents
Please note that some information in this document originated in the documents
listed above. This deliverable is intended to be understood without referencing
multiple documents but there are references to other documents when additional
detail is available.
Executive Overview
This section summarizes the portal data security requirements for the RISD Phase 1
Data Warehouse implementation. All users of the RISD data warehouse are required to
log in to the system through Oracle Portal, with the exception of the general public, who will
be limited through the menu selection to a subset of aggregate reports. The user ID and
password management will be implemented to meet standard RISD security
requirements.
Portal Security
Initially, only RISD employees will be permitted access to the RISD data warehouse
environment. All users will be required to have access to the RISD intranet environment
to ensure full security.
Note: Only the District Assessment team can view preliminary assessment information.
That is, occasionally, data will be loaded before student matching and data cleansing is complete.
Data access by security group:
Student and Parent – Student detail: Yes (only the individual student). Teacher aggregate: No. Campus, Area, and District aggregates: Yes.
Teacher – Student detail: Yes (current students, all assessments). Teacher aggregate: Yes (aggregates for the individual teacher). Campus, Area, and District aggregates: Yes.
Campus Admin (includes assistant principal, advisors, and so on) – Student detail: Yes (all students currently on campus and incoming class). Teacher aggregate: Yes (all teachers on campus, but restricted to students in the administrator’s campus since teachers can teach in multiple campuses). Campus, Area, and District aggregates: Yes.
Principal – Student detail: Yes (all students currently on campus and incoming class). Teacher aggregate: Yes (all teachers on campus, but restricted to students in the principal’s campus since teachers can teach in multiple campuses). Campus, Area, and District aggregates: Yes.
Area Admin – Student detail: Yes (all students currently in the area and incoming class). Teacher aggregate: Yes (all teachers in the area, but restricted to students in the administrator’s area since teachers might teach in campuses that cross areas). Campus, Area, and District aggregates: Yes.
Area Superintendent – Student detail: Yes (all students currently in the area). Teacher aggregate: Yes (all teachers in the area, but restricted to students in the area superintendent’s area since teachers might teach in campuses that cross areas). Campus, Area, and District aggregates: Yes.
Central Admin – Student detail: Yes (no restrictions). Teacher aggregate: Yes (no restrictions). Campus, Area, and District aggregates: Yes.
Central Superintendent – Student detail: Yes (no restrictions). Teacher aggregate: Yes (no restrictions). Campus, Area, and District aggregates: Yes.
School Board – Student detail: No. Teacher aggregate: No. Campus, Area, and District aggregates: Yes.
Public – Student detail: No. Teacher aggregate: No. Campus, Area, and District aggregates: Yes.
District Assessment Team – Student detail: Yes (no restrictions). Teacher aggregate: Yes (no restrictions). Campus, Area, and District aggregates: Yes.
Data
The following information is based on the Project Management Plan (PMP).
Student information will be updated daily in the data warehouse via RIMS information
updated daily in the Operational Data Store (ODS). PEIMS information will be updated
as available in the Operational Data Store (ODS) and the data warehouse. Data needed
to update the Data Warehouse will be accessed via a single database link to the ODS.
Please note that there is overlapping information in some of the source systems that may
be out of sync, which is why RIMS was selected as the single source for Phase 1. The
intent of the ODS will be to develop business rules to have a single source for every
target element in the Data Warehouse.
In some cases there are multiple administrations of a single test but even in cases where
there is only one administration, there are often supplementary assessment files of new
RISD students who took the test while in another district.
Please note that assessment files will be provided on an “as available” basis and
scheduled to update the database after a cleansing process has been performed. Since
the dates on which assessment files become available vary greatly, all update processes will need to be
scheduled. Dependencies will be noted in the documented production process. For
example, all students on an assessment file should already have a matching student
record. As noted earlier, a process to create a default student identifier is needed in the
ODS to ensure no assessment records are dropped. The DW ETL business rule will be
to add the record unless the student ID is invalid.
Reporting
The Subject Areas and Information Sets in the warehouse will be organized into multiple
report areas for Phase I. Data at the Campus level and above in the location hierarchy is
available to all users. The general public will not have to obtain a user ID and password
to access aggregate-level data by location. However, these reports may be only a subset of
all the available reports and will be controlled via portal. Please see the reporting
requirements document for additional detail on report requirements and specific
information on the business rules necessary to create the reports.
2. There will be aggregate reports at the student level. Only security groups with the ability
to access student level detail will be permitted to view student level information.
3. There will be aggregate reports at the teacher level. Only security groups with the ability
to access teacher information will be permitted to view teacher information. Please note
that teacher aggregates will be for all courses a teacher is responsible for even if they
cross campuses or areas. Course data will not be carried in the data warehouse.
4. Detail reports will be available at the individual student level. A detailed student level
report is needed to provide information on a particular test for a student. Defining a
common detail report by assessment area across all grades will avoid creating a custom
report for each grade. In some cases, reports across test areas will have the same
formats (for example, TAKS and TEKS). Some areas such as PSAT and SAT may not
have student detail reports since students are notified directly for their PSAT and SAT
results. A student summary report of all assessment areas might also be needed.
The following dimensions were identified for common reporting. Please note that not all
dimensions will be required in every report and some reports may require additional
dimensions. However, the general rule is that all of the below dimensions will be
permitted in every Discoverer report but some dimensions will be static and potentially
“hidden” from end users. That is, the dimensions would be available for analysis with no
changes to the report but might not be displayed since general end users would normally
use the default.
Type of report – State report using state supplied student demographic data or Local
report using RISD student information. At this time there are no requirements to
create a state and local version of the same report.
Test Area – Subject area for an assessment, for example Math (assumption: a teacher
can view multiple subject areas)
Teacher – Teachers are only allowed to view students who are currently in their
classes and teachers are not permitted to view data on former students.
Economic Status
Ethnicity
Gender
Course number and course period will be needed to identify current teacher-student relationships.
RISD currently has a different Tool or “system” to create reports in each test area. When
cross assessment reports are required, the assessment team physically combines data
into a single file for reporting. This will not be necessary in the integrated environment.
Elementary school principals can select report details by teacher by student. Secondary
campus Tools can currently select report detail on students by course, teacher, and
period but enrollment data (course and period) is not in scope for Phase 1.
1. RISD is responsible for the overall portal design with the assistance of Campus EAI.
2. Oracle Consulting will assist with the high level portal design and develop the
interface requirements between portal and the Discoverer reports.
3. Oracle may be responsible for the content of some Discoverer reports that are used
in the portal dashboard (depending on what reports are selected). It is likely that
RISD will include reports not generated from the data warehouse to show more
volatile information such as school attendance. Since the Phase 1 release of the
data warehouse focuses on student assessments, this information may not be
available in the Data Warehouse.
Portal Requirements
The Oracle team will implement database level software security based on the following
user roles.
An Oracle Form will be created to allow a RISD administrator to add a user to an existing
security group (listed above). For example, for small user groups such as central staff
with ad hoc capabilities and the board, it may be easier to add and delete users online
rather than develop an interface. The creation of new user groups is out of scope for
this project.
[Figure: data flow and portal overview. Elements shown include District Assessment Team conversion files and periodic updates, scheduled assessment loads with ODS corrections for unmatched student detail, automatic updates and summarizations, monthly SIS snapshots, and a portal home page with a general area (logo, messages, and so on), assessment portal links, and assessment dashboards for power users and public or district employees.]
This section defines the functionality needed to support the RISD Portal Requirements
for Data Access. The design and the placement of data on the portal pages is the
responsibility of the RISD Internet project manager. The DW development team will be
responsible for supplying content for some portal pages and for creating the VPD users
groups to provide data warehouse data security.
Please note that the portal menus will control the report links for specific user groups to
determine the reports users are permitted to run. Data security can only restrict user
access at a data level and not at a report level.
Login Page
The RISD Internet project group will create a URL similar to the one below to access the
initial login page:
The login page will show RISD general information in addition to having a login portlet.
For security reasons, users may not be permitted to self-register. RISD will provide an
interface file to the DW group to assign users to security groups. Each user will be
assigned a password that is input into the Oracle Internet Directory (OID) by RISD
Internet personnel. The OID login uses the login user ID when accessing the DW to
internally set security.
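Since the interface file layout will only be defined during the security design phase, the following is a rough sketch of how such a file might be read to assign users to security groups; the comma-separated layout, column names, and group codes are assumptions.

import csv

# Illustrative sketch only: the interface file layout is decided during the
# security design phase; the columns and group codes below are assumptions.
VALID_GROUPS = {
    "SCHOOL_BOARD", "CENTRAL_STAFF", "AREA_STAFF",
    "CAMPUS_STAFF", "TEACHER", "STUDENT",
}

def load_security_assignments(path: str) -> dict[str, str]:
    """Read a hypothetical user-to-security-group interface file and return
    {user_id: security_group} for loading into the DW security tables."""
    assignments = {}
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):  # expects user_id, security_group columns
            group = row["security_group"].strip().upper()
            if group in VALID_GROUPS:
                assignments[row["user_id"].strip()] = group
    return assignments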
Dashboard Page
After Validation of the Login Page, the user will go to the Dashboard Page. The generic
dashboard page will have up to 4 portlets that will dynamically show current information.
One of the four portlets will show the Assessment data that is available in descending
order with the most current assessment at the top of the list. That is, the most recently
processed file will appear at the top of the list since that is the file that will most likely be
of interest to users. In all, 12 assessment areas will be available. The information simply
shows the assessment area (TAKS, TEKS, K-2 Math, and so on) with the assessment
administration date, and the date of the last Data Warehouse update. The information for
the portlet will come from the Assessment Log file. Each time a new assessment file is
processed, a log file record is created that notes the assessment, assessment
administration date, the process date, and whether the update is preliminary or final. The
ODS will create the initial log record in order to trigger the DW update. The DW ETL
process will update the record to indicate the extract was run with the completion code.
Only final information with successful update codes will be displayed for general users.
The user can get to the assessment specific report menu page by clicking on the
assessment file listing in the dashboard portlet. Only the reports for the selected
assessment will appear on the Report Menu Page. AYP reports will be included with
TAKS.
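As an illustration of how the dashboard portlet list could be derived from the Assessment Log described above, the Python sketch below filters and orders the log entries; the field names and status flags are assumptions standing in for the actual log design.

from dataclasses import dataclass
from datetime import date

# Illustrative sketch only: the Assessment Log fields and status flags are
# assumptions standing in for the actual log table design.
@dataclass
class AssessmentLogEntry:
    area: str                  # e.g. "TAKS", "TEKS", "K-2 Math"
    administration_date: date
    process_date: date
    final: bool                # preliminary or final load
    update_succeeded: bool

def dashboard_assessment_list(log: list[AssessmentLogEntry]) -> list[AssessmentLogEntry]:
    """Build the dashboard portlet list: only final, successfully updated
    assessments, most recently processed first."""
    visible = [e for e in log if e.final and e.update_succeeded]
    return sorted(visible, key=lambda e: e.process_date, reverse=True)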
Assessment administration – This selection provides the Year and the administration with
the year if an assessment area has multiple administrations. The default for this value
will be the current assessment year and administration, the cumulative or summary
report for assessments having multiple administrations, or the single or “final”
administration for assessments having only one administration.
There are state reports that use state supplied student demographics and local reports
that use RISD student information. Most reports either use state supplied information or
local information but there may be some reports in the future that will allow the end user
to choose the state or local option.
There are many additional parameters that are allowed depending on the specific report
(see section on common reporting requirements). Rather than control the report
selections within portal, portal will pass control to the Discoverer reports that will allow
additional functionality. The functionality within each report is described in the report
requirement specifications. However, the default parameter will be based on the security
group for the location default.
[Figure: report navigation flow. Portal lists the reports available for an assessment; the user selects a report and parameters, and control passes to Discoverer. The assessment aggregate report supports drilling to lower levels and drill-down by the selected hierarchy, links to a student aggregate report by assessment, and links to a specific student assessment summary.]
Virtual Private Databases (VPD) will be used to maintain security at the database level.
Every user will be assigned to a security group that will limit database access based on
the business needs of users assigned to the security group. RISD will assign users to a
security group and create a process to assign a password to each new user. The user
will be required to update the password with the initial sign on and update passwords on
a periodic basis as per RISD internal requirements. The diagram below shows the
security data model that is documented in Oracle Designer and part of the logical and
physical database design.
[Figure: security data model ERD. Entities shown include ACCESS LEVEL with types (Public, School Board, Central Superintendent, Central Administrators, Area Superintendent, Area Administrators, Principal, Campus Admins) granted to DISTRICT PEOPLE; TIME (School Year, School Semester); STUDENT with the STUDENT ACCESS and STUDENT SCHEDULE facts; CLASS, CLASS PERIOD, COURSE, and SUBJECT; TEACHER; and the LOCATION hierarchy (DISTRICT, AREA, CAMPUS).]
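The enforcement itself will be an Oracle VPD policy in the database; purely as a conceptual illustration of the row-level predicate each security group implies, the Python sketch below filters student-detail rows by security group. The group names, row fields, and scoping rules are assumptions drawn from the access summary earlier in this document.

# Conceptual illustration only: the real enforcement is an Oracle VPD policy
# in the database; the group names and row fields here are assumptions.
def student_rows_visible(rows, security_group, user_context):
    """Return the student-detail rows a user may see, mirroring the kind of
    predicate a VPD policy would append for each security group."""
    def visible(row):
        if security_group in ("CENTRAL_ADMIN", "CENTRAL_SUPERINTENDENT",
                              "DISTRICT_ASSESSMENT_TEAM"):
            return True                                   # no restrictions
        if security_group in ("AREA_ADMIN", "AREA_SUPERINTENDENT"):
            return row["area_id"] == user_context["area_id"]
        if security_group in ("CAMPUS_ADMIN", "PRINCIPAL"):
            return row["campus_id"] == user_context["campus_id"]
        if security_group == "TEACHER":
            return row["student_id"] in user_context["current_student_ids"]
        if security_group in ("STUDENT", "PARENT"):
            return row["student_id"] in user_context["own_student_ids"]
        return False                                      # School Board and Public see no student detail
    return [row for row in rows if visible(row)]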
The OID directory must interface with the database in order to associate Portal single
sign-in user IDs to the Oracle Database security groups. Oracle’s Virtual Private
Databases (VPD) feature ensures users within a security group only have access to data
as defined in the previous section. The diagram below shows the relationships between
the security portal and the Data Warehouse.
[Figure: security administration overview. OID, a process to assign default passwords to new users, a process to assign or change user passwords, and an online form to add or remove users from security groups.]
1. RISD creates the initial Login Screen using portal pages available in the Campus EAI
repository. This initial page is often referred to as the blackboard. Portal needs to
communicate the specific user to the database to ensure proper data security. The
portal software validates the user ID and password. A process will be set up whereby
RISD creates a default user ID and password for all employees and the board
with a randomly created password set to expire immediately. When the user logs on
for the first time, the system would prompt them to change their password after
authenticating the user. The potential problem with this approach would be the
security associated with notifying employees of their new user ID and password. A
possible alternative would be to have employees and the board self register based
on knowing key employee information. Employees must be assigned by Oracle to
the proper security group via the HR employee interface file, and students or parents
(in the future) will be in the student security group. Public users will not need to log in
but will have access to the initial public page that would include the login portlet.
2. After login, the user would go to the Dashboard page where several portlets would
supply dynamic information. For example, one of the portlets would show the
assessment files that were available with the most recent updates first. Clicking on
one of the assessment areas would link the user to the page that lists all related
reports for that assessment area. Reports that compare results across multiple
assessment areas would appear in multiple assessment menus. For example, the
report that compares TEKS and TAKS results would appear in both the TEKS and
the TAKS menus. However, after selecting the time periods for report comparison,
the portal would only allow the report to run if BOTH data files were available for the
time period.
3. For security groups that require login, the initial Report Page shows all reports available for the
security group (the board might be restricted to a subset of aggregate reports). That
is, if the board only has access to a subset of reports, they would only see reports on
their menu that they were permitted to run. Only the district assessment group would
have access to “preliminary” assessment data for report creation. After the “final”
assessment numbers are loaded, all users will have access to the data.
4. The user must select the assessment test area. “SUM” (currently defined as ALL)
might be allowed if a report summary was available across subject areas. In some
cases like SAT reports, only a combined score report is available so selecting the
test area would not be necessary. However, if subject area reports were available in
the future, this prompt could be added. For consistency, this prompt would show for
all report options but in the case of SAT reports, only one option would show.
5. The default time is the latest test year that has “final” results but the user can change
Approvals:
RISD Data Model
Document Control
Change Record
Name Position
Distribution
1 Library Master
Contents
1.0 Introduction
1.1 Background
The RISD Data Warehouse environment will provide decision makers throughout the District with
information to help improve student achievement. Users will access the system via Windows
based Internet-capable computers to accommodate the needs of both computer novices and
experts. Data that is currently processed independently in multiple “applications” will be
integrated into a single environment to facilitate data reporting and analysis across test areas.
Discoverer report data will be available online and can be downloaded into local applications
where appropriate (for example, spreadsheets and PC databases) to perform additional analysis
or for integration with local data. Phase I will primarily focus on student assessment data.
User groups will be restricted to accessing information associated with their responsibilities,
needs, and skills. The majority of end users will run parameter driven reports to obtain multiple
views of the assessment data. Dashboard reports via Oracle Portal will provide summarized information.
1.2 Purpose
The Data Model provides a definition and structure of all the data that RISD personnel will need
to satisfy their student achievement business requirements. The generic Logical Data Model
(LDM) created for use in K-12 applications was used as the starting point in creating the RISD
data model and the generic model is available to RISD in Oracle Designer and will be used to
add subject areas in future releases.
The RISD specific Logical Data Model representing the Phase 1 data the RISD assessment team
uses or generates is documented in a separate Designer Repository and the ERD can be found
in section 5.0. The model reflects information from the generic K-12 data model customized with
information provided by RISD. The Oracle team reviewed the business relationships with the
RISD Assessment team to ensure completeness. The RISD model contains all the information
that the RISD data warehouse is anticipated to need in Phase 1 to support assessment
requirements. This deliverable includes Entity Relationship Diagrams and standard Entity and
Attribute reports from Oracle Designer.
The Oracle Designer repository will be used in conjunction with Oracle Warehouse Builder
(OWB) in the implementation of Phase 1. The phased implementation approach adds subject
areas to the student assessment data warehouse in increments to ensure that the highest priority business requirements are addressed in early releases.
The three main concepts used in data modeling are the following:
Entity – Something of significance to the business about which information is held. It can be an
object, person, place, thing, activity, or concept that is important to the business. Every entity
should have a unique means of identification (unique business name or in physical terms, a
unique key) and one or more other characteristics (relationships or attributes) that are captured to
describe it.
A Student is an example of an entity. In the diagram, entities are represented as boxes with
rounded corners. The size, color, and shape of the boxes are of no particular significance but
they are often used to assist with clarifying the diagram.
A line that joins two entity boxes together illustrates a relationship between the entities. There are
two relationship aspects that are depicted by the line: mandatory/optional associations and
cardinality.
Mandatory/optional associations: A solid line from an entity box means that an occurrence of that
entity must be associated with an occurrence of the entity at the other end of the line. A broken
line from an entity box means that an occurrence of the entity may be associated with an
occurrence of the entity at the other end of the line. Each end of the line is given a title that
describes the nature of the relationship for the entity at that end of the line.
Cardinality: A further convention is the use of a crow’s foot at the end of the line. This indicates
that the association is with one or more occurrences of the type of entity to which the crow’s foot
is attached. When the crow’s foot is absent it means that only one occurrence of the entity at that
end of the relationship is involved in each occurrence of the relationship.
For example, the following extract shows the relationship between a Student and the student
access facts.
[Diagram extract: the D STUDENT entity (# ID; * RISD ID; * PEIMS ID; * LAST NAME; o FIRST NAME, MIDDLE NAME, ADDRESS, CITY, ZIPCODE) related to the F STUDENT ACCESS fact through the "on"/"for" relationship ends.]
Attributes are qualitative, quantitative, narrative or descriptive (includes audio, video, image, and
so on) characteristics of an entity. Attributes should be consistent in meaning and have only a
single definition. (In physical design, considerations may be made to combine mutually exclusive
attributes into a single column, but this is not a recommended practice). Attributes can have a set of predefined values (called a domain) or other sets of restrictions (range edits, cross-field edits) that constrain the attribute.
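To make this concrete, the following sketch shows how a predefined domain and a range edit of that kind might eventually be enforced as check constraints in the physical design. The table and column names are purely illustrative and are not taken from the RISD model.

-- Illustrative sketch only: a domain and a range edit expressed as constraints.
CREATE TABLE d_student_sample (
  id           NUMBER PRIMARY KEY,
  first_name   VARCHAR2(30),
  last_name    VARCHAR2(30) NOT NULL,
  -- predefined domain of allowed values
  gender_code  VARCHAR2(1) CHECK (gender_code IN ('M', 'F')),
  -- range edit expressed as a check constraint
  birth_year   NUMBER(4) CHECK (birth_year BETWEEN 1900 AND 2100)
);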
The concept of a dimensional schema, as represented by its popular rendition the Star schema or its variation the Snowflake schema, is derived from multidimensional database design.
In this model, the Star has a central fact table radiating to several dimensional tables. Each star is
designed as a central, usually large table of facts typically recording a particular type of event.
Each event or fact occurs within the context of several dimensions. For example, when
considering transaction facts, you might note that each one of these facts occurred on a
Fact tables commonly hold the largest collection of data in the data warehouse. To facilitate an
efficient way of handling these large volumes of data, it is usually recommended that the fact
tables be partitioned. Partitioning will be determined in the Physical design.
As mentioned earlier, each row in a fact table has a column corresponding to the primary key of
each of the dimension tables in the star. In addition to these foreign key columns, the fact table
contains one or more columns that describe the volume, frequency, assessment score, or other
numeric measure that can be summed, averaged, or aggregated in a query. Except for the
foreign keys, virtually all attributes in a fact table turn out to be measurements that can be
meaningfully summed. Textual attributes are generally of little value in fact tables because they
cannot be arithmetically combined when a query seeks to retrieve many rows. In reality, not all
numeric data types are useful in the fact table.
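As a rough illustration of this structure, the sketch below (the table and column names are hypothetical, not the actual RISD DDL) shows a fact table made up of foreign-key columns pointing at dimensions plus summable numeric measures; in the real design each key column would also carry a foreign key constraint to its dimension table.

-- Sketch only: foreign keys to dimensions plus additive numeric measures.
CREATE TABLE f_sample_student_results (
  time_id       NUMBER NOT NULL,   -- foreign key to the Time dimension
  location_id   NUMBER NOT NULL,   -- foreign key to the Location dimension
  student_id    NUMBER NOT NULL,   -- foreign key to the Student dimension
  raw_score     NUMBER,            -- measure that can be summed or averaged
  scaled_score  NUMBER,            -- measure that can be summed or averaged
  tested_cnt    NUMBER DEFAULT 1   -- additive count measure
);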
In many cases, Fact Summary tables are created to help improve performance. The contents of
these fact summary tables are determined by the access (reporting) requirements.
Dimension tables tend to be relatively small when compared to the fact tables. Dimensions may
hold from a dozen to a few thousand rows of data. The objective of the star schema is to be able
to efficiently select a subset of the total fact table by restricting the number of fact rows through
limiting conditions specified about the dimensions.
Dimension tables hold all the descriptive attributes in the star schema rather than the fact table.
For example, the assessment name that did not belong in the fact table will comfortably find a
home in the dimension table. The number of distinct values in a dimension table is critical to its
usefulness. A dimension with only a few values is not particularly selective. On the other hand, if
values in the dimension table approach the number of rows in the fact table, the dimension loses
its value. In the ideal situation, though one dimension may not be particularly selective, a
combination of criteria involving multiple dimensions can identify a useful subset of the fact table
for analysis. The dimension tables are not generally normalized. Disk space savings gained by
normalizing the dimension tables are trivial compared to the total size for the star schema.
Another crucial factor is determining the grain statement of the fact table. For example, the grain may be student detail at the assessment test area level.
There are dimensions where the information about them changes slowly over time. These
dimensions are called slowly changing dimensions. Since RISD has expressed a need to track
changes over time, the Data Warehouse Team will create an additional record at the time of the
change with the new attribute values so that we can segment the data historically between the
old description and the new description.
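A minimal sketch of that approach follows, using hypothetical table and column names and literal sample values; in the RISD model the FH STUDENT HISTORY table plays this role. The current history row is end-dated and a new row is inserted, so old and new descriptions can be segmented by date.

-- Close out the current row for the student whose tracked attributes changed ...
UPDATE fh_student_history_demo
   SET campus_departure_date = SYSDATE
 WHERE risd_id = 12345
   AND campus_departure_date IS NULL;

-- ... and insert a new row carrying the new attribute values.
INSERT INTO fh_student_history_demo
       (risd_id, campus_id, campus_start_date, campus_departure_date)
VALUES (12345, 'RHS01', SYSDATE, NULL);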
3.4 Time Dimension
In a data warehouse environment, virtually all data in the fact tables is dimensioned by time.
Therefore, the granularity of the time dimension has significant implications on the way end users
slice and dice data in the fact tables. The time, therefore, merits a special, short discussion. The
time dimension needs to be defined in a way that takes into consideration end users’
The following sections show the high-level Logical Data Model (LDM) relationships. Please refer to the entity attribute listing for a full list of attributes.
[Diagram: security, class, and location entities. D ACCESS LEVEL (# ID; * CODE; * DESCR) with access-level types (Public, School Board, Central Superintendent/Administrator, Area Superintendent/Administrator, Principal, Campus Admin), D TIME (School Year, School Semester), D CLASS, D CLASS PERIOD, D SUBJECT, D COURSE, D TEACHER, and D LOCATION (D DISTRICT, D AREA, D CAMPUS), with relationships such as "granted to", "teaches"/"taught by", "occurs in", and "location for".]

[Diagram: assessment, AYP target, and TAKS exit-level standard entities. D ASSESSMENT (# ID; * CODE; * DESCR; o ADMINISTRATION), D TIME (School Year, Calendar Date, Administration Period, School Semester), RISD AYP TARGET VALUES, D TAKS EXIT LEVEL STANDARD, and FS OBJECTIVE ITEM (o ITEM NUM; o CORRECT RESPONSE; o RESPONSE CNT) and their relationships.]

[Diagram: D TEKS CLASS (# ID) grouped by D TIME.]
[Diagram: SDAA assessment. FS SDAA summary (tested count, met/not met expectation counts and percentages), F SDAA STUDENT RESULTS, and F SDAA STUDENT TEST AREA RESULTS (raw score, scaled score, achievement level, ARD decisions, expectation indicators) related to D TIME, D ASSESSMENT, D TEST AREA, D STUDENT SDAA, D SDAA SCORING DD, D SDAA ACHIEVEMENT DD, D ARD DECISION GRIDDED/RESOLVED DD, D STUDENT, D LOCATION, D DISTRICT, and D AREA.]

[Diagram: RPTE assessment. F RPTE RESULTS and F RPTE STUDENT TEST AREA RESULTS (district/campus counts, tested count, raw score) related to D GRADE, D TEST GRADE, D TIME, D RPTE PROFICIENCY RATING, D ASSESSMENT, D YRS IN SCHOOL, D LOCATION, D DISTRICT, D AREA, and D TEST AREA.]
4.6 AP Assessment
[Diagram: FS AP summary fact (tested count, tests taken count, at-or-above count and percent, multiple test count, score 1 through 5 counts and percentages, unknown score count and percent) related to D TIME (School Year, Calendar Date, Administration Period), D LOCATION, D DISTRICT, and FH STUDENT HISTORY (district/campus start and departure dates).]
[Diagram: LDAA assessment. F LDAA STUDENT RESULTS fact (reading/math test method, reading met criteria) related to D TIME (School Year, Administration Period, Calendar Date), D GRADE, D TEST GRADE, D ENROLLED GRADE, D LOCATION, D DISTRICT, and D AREA.]

[Diagram: SAT. FS SAT REASONING summary fact (graduates count, tested count and percent, SAT total and mean scores, at-or-above count and percent) and FS STUDENT SAT HI SCORE (math, verbal, and written scores, essay and multiple-choice subscores, composite score, highest math score) related to D TIME.]
[Diagram: K-2 Math. F K2MATH STUDENT RESULTS and F K2MATH STUDENT TEST facts (district and campus counts) related to D TIME, D ASSESSMENT, D GRADE, D TEST GRADE, D TEACHER, D LOCATION, and D DISTRICT.]
[Diagram: ERWA. FS ERWA RESULTS summary fact (tested count; met benchmark DRA, TPRI, and written counts and percentages; met 1 of 3 count) related to D TIME (Administration Period, School Year, Calendar Date) and RISD ERWA TARGET SCORE.]
[Diagram: Tejas LEE. F TEJAS LEE STUDENT RESULTS fact (section 1 through section 8 scores) related to D TIME (Administration Period, School Year, Calendar Date), RISD TEJAS LEE TARGET SCORE, D GRADE, D TEST GRADE, D TEST AREA, D STUDENT, D LOCATION, D DISTRICT, D AREA, and D CAMPUS.]
While reviewing the access requirements when creating the Logical Data Model, it often becomes
clear that summary tables will be needed. Summary tables and “views” of the data are associated
with the physical database design and are used to improve performance or simplify end user data
access. These “summary” tables may be implemented physically as materialized views, but those decisions will be made during the physical database design. Some summaries have already been identified; they can be recognized in the entity list by their FS (Fact Summary) prefix.
Business rules for the creation of derived fields in the summary tables may continue to be added
after the physical design is approved but before implementation is complete. The intent is to
precalculate derived fields that will be used multiple times in reporting and analysis to ensure
consistency across reports. These business rules are documented in the report requirements and
included in the data model for base table calculations. Base tables are simply the lowest level
tables in the model. Some business rules will only apply to summary tables and these rules will
Data Modeling is the process through which a logical structure is imposed upon the data
elements within a system. This logical structure standardizes the data, and thus becomes the
cornerstone of any information infrastructure. The data model provides a graphical representation
of how each piece of data relates to itself and others. This representation is known as an Entity
Relationship Diagram (ERD). The first step in creating the Data Model involves identifying and
defining attributes. Attributes are the lowest level of data that can be described, for example, last
name, badge number, building number. These attributes are grouped (using Information
Modeling techniques, such as Normalization) into entities. Usually, entities can be described as
nouns— employee, product, customer, sales-rep and so on. The next step would be to define the
relationships between and among the entities. Relationships are described as verb phrases, for
example, employees “work at” locations, customers “buy” products. There are several different types of relationships.
Business rules refine and enforce the definition of these relationships. For example, a business rule may state that an employee “works at” 1 and only 1 location. Although this relationship might initially have been defined as a many-to-many relationship, the business rule eliminates that possibility, thereby enforcing the desired many-to-one relationship.
The compilation and integration of all of the entities, attributes, and relationships referenced by a
system is called the logical model. After completing the logical model, the next step is to create
the physical model. The physical model represents how the logical model will be physically implemented. At a basic level, each entity in the logical model becomes a table in the physical model, each attribute within an entity becomes a column within that respective table, and each relationship becomes a constraint between the tables.
Each table has a primary key associated with it. A primary key is a unique identifier for the table.
Remembering that a table is just the physical manifestation of an entity, a primary key allows us
to uniquely identify a particular instance or occurrence of this entity. For example, the STUDENT
ASSESSMENT table contains all Student assessment details. A primary key into the STUDENT ASSESSMENT table, such as the student identifier, allows us to uniquely identify an individual student.
Where there are relationships between two tables, data must exist to “tie” them together.
Depending on the type of relationship, the primary key from the “one” side of a relationship may
be transferred to the “many” side of the relationship. This creates a parent/child relationship with
“one” side, being the parent and the “many” side being the child side. The transferred primary key
from the parent is called a ‘foreign key’ in the child entity. Sometimes, the foreign key becomes a
component of the primary key set of the child entity. Once all the entities and relationships have been defined and diagrammed (hence the name Entity Relationship Diagram [ERD]), the logical model is complete.
The physical data model is developed/generated from the logical model. Information found in the
physical data model but not in the logical data model includes objects such as, tablespaces,
sizing within the tablespaces, growth rates for the tablespaces, partitioning schemes, indexes,
and so on. Foreign keys are good candidates on which to place indexes. In addition, any column in the data warehouse that end-user retrievals will use to limit the rows returned should have an index defined and created on it.
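The following sketch, with hypothetical table names, illustrates both points: the parent primary key reappears as a foreign key column in the child table, and that column receives an index because end-user retrievals will filter on it. (For fact tables pointing at dimensions, the design described later in this case study would generate a bitmap index rather than the plain index shown here.)

-- Parent (dimension) table with its primary key.
CREATE TABLE d_campus_demo (
  campus_id    NUMBER PRIMARY KEY,
  campus_name  VARCHAR2(60)
);

-- Child (fact) table: the parent key becomes a foreign key column.
CREATE TABLE f_results_demo (
  result_id    NUMBER PRIMARY KEY,
  campus_id    NUMBER NOT NULL REFERENCES d_campus_demo (campus_id),
  time_id      NUMBER,
  raw_score    NUMBER
);

-- Index the foreign key, since queries will restrict rows by campus.
CREATE INDEX f_results_demo_campus_ix ON f_results_demo (campus_id);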
All of these objects discussed (entities, attributes, relationships, business rules, logical models,
physical models, and so on) are known as metadata. Metadata is often called “data about data”,
and collectively it provides a documented understanding of the data in a system. For example,
the entity student is metadata that describes a specific instance of a student. Each instance of
student also contains a set of additional information (attributes). All of this metadata is stored
within the repository. A repository is a central location where all metadata is catalogued,
documented, and stored. Every piece of data represented in any model produced by the Data
Warehouse design team will have its metadata stored in this central repository.
A.1 - Data Modeling Activities
The data modeling activities involved in building the data warehouse can, and will be separated
into the following activities:
⇒ Building the repository environment
⇒ Conducting end-user interviews (Functional Specialists)
⇒ Collecting existing metadata assets
⇒ Metadata Analysis/Quality of the current data
⇒ Capturing the Business Rules
After the completion of these tasks, the database designs, transformation logic, and end-user
definitions will be turned over to the data warehouse implementation team.
A.2 - Building the Repository Environment
In the creation of any data warehouse, metadata is one of the critical success factors. Metadata
comes in two flavors: technical metadata and business metadata. Technical metadata describes
the technical aspects of each piece of data—name, size, data type, valid values, and
transformation rules. Business metadata describes what each piece of data means (descriptions,
business rules, calculations, computations, summarizations, aggregations, and so on).
Gathering and recording this metadata is of limited value, however, if there are no means to
dynamically maintain and reference it.
In order to provide a single place that describes the environment and all of its individual pieces, a
repository needs to be created and populated. Oracle Corporation will use Oracle Designer (part of Oracle's Internet Development Suite) to dynamically maintain and reference the RISD metadata repository and the Oracle Warehouse Builder (OWB) repository. Collecting all of the assets and storing them in a single repository will provide a consistent environment for all users, both business and technical, to ask questions about what data is available, where it was obtained from, and what manipulations, if any, it went through before it was put into the data warehouse. Thus, the repository will serve as a tool to distinguish information from data. Data is
facts; information is the correlation of facts. Moreover, it is the understanding of the correlation,
as defined by the metadata, which makes data become information.
In a complex metadata environment with large modeling teams, model management and user
extensibility features become quite important. Oracle Designer’s repository helps in this regard
with access and version control, controlled sharing, object grouping, extract, load, merge, object
checkout, and check-in. It also has extensibility features through its application program interface
(API) and a facility to create new properties, objects and associations.
The architecture document describes in more detail how Oracle Designer will integrate with the
OWB Extract, Transform, and Load (ETL) tool.
The usability and usefulness of a data warehouse is completely dependent on the quality of the
data and information used during the design and build phases. Determining the current RISD
systems that constitute the most accurate data available helps us determine the “system of
record”. In the case of state assessment information, right or wrong, the information on the state
supplied input is considered the system of record. However, current or point-in-time student
demographic information may be more accurate for reporting “local” results. Gathering and
documenting the current data structures from the system of record that contains pieces of data
intended for the warehouse is critical. In our case, all data will be loaded into the Operational
Data Store (ODS) as the single source for the Data Warehouse. However, determining the data
sources and business rules needed to load data into the ODS is critical to the success of the
Data Warehouse project.
ROY INDEPENDENT SCHOOL DISTRICT
Approvals:
1 Document Control
1.3 Distribution
Copy No. Name Function
Note to Holders:
If you receive an electronic copy of this document and print it out, please write
your name on the equivalent of the cover page, for document control
purposes.
If you receive a hard copy of this document, please write your name on the
front cover, for document control purposes.
6 MATERIALIZED VIEWS.........................................................................................7
6.1 Discoverer End User Layer .............................................................................7
7 DATABASE OBJECTS ..........................................................................................8
7.1 Dimension Tables .............................................................................................8
7.2 Fact Tables ......................................................................................................12
7.3 Summary Tables .............................................................................................18
7.4 Other DW Tables ............................................................................................19
7.5 Sample DDL to Create Data Warehouse Indexes .....................................19
8 SAMPLE DATABASE DEFINITION LANGUAGE (DDL) ...............................21
2 Introduction
2.1 Purpose
This Physical Database Design document defines how the Logical
Database Design approved by the RISD business community translates
to the physical environment required to support the RISD assessment
reporting requirements. The physical environment is designed to add
subject areas in subsequent phases of the DW project. The table
structures, indexes, partitioning schemes, and so on, are a subset of the
2.2 Background
The data warehouse team created the RISD Logical Data Model (LDM)
based on a generic data model created from similar K-12 organizations.
Specific RISD requirements and business rules provided by RISD
business users, technical users, and the District Assessment team
resulted in an enhanced model specific to RISD needs. The Phase 1
RISD logical data model has been designed to meet RISD initial
assessment reporting requirements but can be easily expanded to
include the additional subject areas included in the generic data model.
The requirements reflect current 2004 assessment inputs.
4.2 Characteristics
The database instance will have the following characteristics:
1. Since the DW size is relatively small we will initially create only two
tablespaces.
Tablespace for data (DWD)
Tablespace for indexes (DWI)
Note: Dev has the OWB Design Repository in OWBRTTAB for tables and
OWBRTIDX for indexes
2. OWB Tablespaces
OWB Design Repository – OWBTAB for tables and OWBTIDX
for indexes
OWB Runtime Repository – OWBRTTAB for tables and
OWBRTIDX for indexes
3. Tablespaces will be locally managed Oracle tablespaces (bitmap managed rather than dictionary managed), which also allows them to be transported.
4. Tablespaces can be spread across multiple mount points on the physical array. This will ensure that there are no I/O problems associated with mixing tables and indexes. This is not an issue in the RISD hardware environment since each mount point is striped across multiple devices and multiple devices are used for each tablespace. The concept of keeping tables and indexes in separate tablespaces grew out of the general need to provide additional I/O capabilities. This was more important when disk striping and storage array technology did not exist but is still considered best practice.
5. Production control processes will ensure data integrity and control
data warehouse processing logic.
Summaries will be used in some cases to support security. That is, all
users will be allowed to view aggregate data but there will be restrictions
at the student level. Materialized views created from student detail would carry the access restriction to the higher levels.
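A hedged sketch of such a summary follows, built on a hypothetical student-detail fact table (F_RESULTS_DEMO, as used in an earlier sketch); whether the real summaries are implemented as materialized views is a physical design decision, as noted above.

-- Sketch only: an aggregate materialized view built from student-level detail.
-- Most users would query the aggregate while access to the detail table stays restricted.
CREATE MATERIALIZED VIEW mv_campus_results_demo
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT campus_id,
       time_id,
       COUNT(*)       AS tested_cnt,
       AVG(raw_score) AS avg_raw_score
FROM   f_results_demo
GROUP  BY campus_id, time_id;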
2  D_ADMINISTRATION_PERIOD
   Description: A component of the Time dimension. It represents the Test Administration periods.
   Comments: Not all periods are month/year types. Some are beginning, middle, and end of year.

3  D_ASSESSMENT
   Description: The Assessment Dimension contains the high level descriptive information for the Assessments.
   Comments: In moving the assessment dimensions from logical to physical, consideration for access and loading was made. Due to the nature of change to items, objectives, and test areas of an assessment set, a level of snowflaking has been left in the model for implementation. Also, not all assessments have all levels. However, the individual tables have been flattened and repeating groups held as sequentially numbered columns. Example: D_OBJECTIVE_ITEM holds all items for a particular objective. Complete flattening of the dimension was considered; however, looking at TAKS as an example, there are 5 test areas, each test area can have up to 10 objectives (but we would leave room for 15), and each objective can have 20 items (math has up to 60 items, reading has up to 51 items). The row would be extremely wide, and although there are reduced joins, fewer rows can be retrieved.

4  D_CAMPUS
   Description: The Location dimension holds the descriptive details of the District's key Locations such as Campus and Area. District is the highest level within the Location dimension. It represents the RISD school district. Area is a level within the Location dimension. It represents the Areas within the District. The Campus is a level in the Location dimension. It represents the district schools.
   Comments: Handle assessments that have unknown (Non RISD) Districts and Campuses by creating rows to handle these states. A row for Unknown Campus and a row for Unknown District should be created. Hierarchy combos: RISD, Unknown Campus; Unknown District, Unknown Campus. Relationships from Facts to Location will indicate accountability. In addition to the location supplied by the assessment and the student's location, there will be one for Campus and District accountability. Therefore, the Location table will require rows to handle nonaccountable relationships.

5  D_CAMPUS_TYPE
   Description: Contains the Campus Types in the school district (Elementary, Junior High, High, and so on).
   Comments: These types should be synchronized with Locations (Campuses) and Grades/Grade Groupings.

8  D_COURSE
   Description: The course table contains the courses offered by the District for each subject.
   Comments: Although not modeled specifically, there will be a column for each special indicator when transformed to physical.

9  D_DISTRICT_PEOPLE
   Description: This table contains the types and particulars of the people associated with the District. The person type will determine the level of access the person has to the data.
   Comments: Teacher may also reside in this table.

10 D_ECONOMIC_DISADVANTAGED

11 D_EDUCATION_GRADE
   Description: The Grade Hierarchy is the upper level of the grade dimension. It contains groupings such as Pre-K, K-12, Post 12, and so on. K-12 is the grouping most focused on for RISD reports. The Grade dimension contains data regarding the type of grade groupings within the District for which reporting will occur. Examples are: ELEM = Elementary, JH = Junior High School, HS = High School. Grade Grouping may be considered Campus (School) Type. NOTE: SDAA requires a slightly different arrangement (structure?) for its representation of instructional grade. The School Grade dimension contains the Student Grade levels offered by the District. The Grade may correspond with an equivalent Test Grade Level. Examples: K = Kindergarten, 1 = First Grade ... 12 = 12th Grade.
   Comments: Grade was initially modeled as two sub-entities: Test Grade and Enrolled Grade. Analysis of values and discussions with the Assessment Team allow it to be created as a single entity (table). To avoid a more complex structure for the user, a balanced grade hierarchy of three levels is being created. Since reports looking at ALL grades deal with only K-12, the 'ALL' level of the hierarchy for report purposes will be created as 'K12'. Additional top-level nodes will exist for the other grade groupings.

12 D_ETHNICITY

13 D_GENDER

14 D_ITEM_RESPONSE_GROUP
   Description: Contains response groupings such as 'A/F', 'B/G', and so on.

15 D_LEP

18 D_PERFORMANCE_GROUP
   Description: This table holds the performance groups for student results.
   Comments: Applies to TEKS. Question whether it applies to TAKS or other assessments.

20 D_SPECIAL_ED

21 D_STUDENT
   Description: This is the base Student dimension table. It contains the core student information for which
   Comments: Source is primarily ODS SIS_STUDENT. Rows are also loaded

24 D_STUDENT_GROUP
   Description: The Student Group entity contains attributes by which students are grouped.
   Comments: Currently a super-entity of student groups. Each primary subgroup may be implemented as individual tables. The super-entity may also be implemented for additional flexibility.

27 D_STUDENT_SDAA

31 D_TAKS_EXIT_LEVEL_STANDARD
   Description: Holds the codes for the Passing Standard applied to the student's scores. The TAKS exit-level standard in place at the time a student begins Grade 10 is the standard that will be maintained throughout the student's high school career.
   Comments: Maybe physical as a degenerative dimension. P = Panels' Recommendation; 1 = 1 SEM*; 2 = 2 SEM* (* Standard Error of Measurement).
The following is a list of fact and fact history base tables. As part of the
analysis for satisfying the security requirement for accessing preliminary
assessment data, preliminary fact tables were considered. Although the
ETL code is a little more complicated and updates will take a little
longer, it was decided to include an indicator in the fact tables to show whether the record is preliminary or final. Maintaining multiple fact tables
would have increased DBA maintenance and would potentially impact
performance.
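The sketch below (hypothetical table and column names) illustrates the chosen approach: a single fact table carries the indicator, and a view restricted to final rows is what general users would query, while the assessment team works against the base table.

-- Fact rows carry an indicator; 'Y' until the assessment team finalizes the load.
CREATE TABLE f_taks_results_demo (
  student_id       NUMBER NOT NULL,
  raw_score        NUMBER,
  preliminary_ind  VARCHAR2(1) DEFAULT 'Y'
                   CHECK (preliminary_ind IN ('Y', 'N'))
);

-- General users query a view that exposes only final data.
CREATE OR REPLACE VIEW v_taks_results_final AS
  SELECT student_id, raw_score
  FROM   f_taks_results_demo
  WHERE  preliminary_ind = 'N';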
3  F_ERWA_STUDENT_RESULTS
   Description: The F ERWA RESULTS fact table contains the detailed test results for a student.
   Comments: Accountability may be identified through relationships with Location: 1 for campus, 1 for district.

4  F_K2M_STUDENT_ITEM_RESULTS
   Description: The F K2MATH RESULTS fact table contains the detailed test results for a student.
   Comments: Accountability may be identified through relationships with Location: 1 for campus, 1 for district. The sequences in the K2 Math results tables will be synchronized with each other. K2M_SEQ provides the sequence and will be consistent for a source row across TA, OBJ, and ITEM results tables.

5  F_K2M_STUDENT_OBJ_RESULTS
   Description: The F K2MATH RESULTS fact table contains the detailed test results for a student.
   Comments: Accountability may be identified through relationships with Location: 1 for campus, 1 for district. The sequences in the K2 Math results tables will be synchronized with each other. K2M_SEQ provides the sequence and will be consistent for a source row across TA, OBJ, and ITEM results tables.

6  F_K2M_STUDENT_TA_RESULTS
   Description: The F K2MATH RESULTS fact table contains the detailed test results for a student.
   Comments: Accountability may be identified through relationships with Location: 1 for campus, 1 for district. The sequences in the K2 Math results tables will be synchronized with each other. K2M_SEQ provides the sequence and will be consistent for a source row across TA, OBJ, and ITEM results tables.

7  F_LDAA_STUDENT_RESULTS
   Description: The F LDAA STUDENT RESULTS fact table contains the detailed test results for a student.
   Comments: Data will also be taken from TAKS and SDAA sources, in addition to the LDAA administered locally. Accountability will be identified through relationships with Location: 1 for campus, 1 for district. Decide time relationships.

8  F_PEIMS_LEAVERS
   Description: Contains the Students that have left the district and the reason as to why they left. Graduates are designated in the Graduate Cnt column (1 = graduated, 0 = did not graduate).
   Comments: This table started out as a graduates table (F PEIMS GRADUATES) but evolved to the Leavers table. The Reason(s) for Leaving will be carried as individual columns in the fact table (degenerative dimension style) as opposed to a dimension. One reason is carried initially. If more reasons are required, add a column for each.

10 F_RPTE_STUDENT_ITEM_RESULTS
   Description: The F RPTE RESULTS fact table contains the detailed test results for a student.
   Comments: The RPTE Proficiency rating exists at each level within the entity structure. It is listed as an attribute in the item

11 F_RPTE_STUDENT_PROF_RESULTS
   Description: The F RPTE RESULTS fact table contains the detailed test results for a student. Contains the item counts to which the student provided correct answers within a proficiency rating level.
   Comments: The RPTE Proficiency rating exists at each level within the entity structure. It is listed as an attribute in the item detail but may also be held as a dimensional key. The relationship is also valid for the other subentities. Objective results are currently not used for reporting; decide whether to retain them for future use. The actual scores are for Objective/Proficiency rating. Accountability will be identified through relationships with Location: 1 for campus, 1 for district. No reports have been defined for TELPAS as yet, and the RPTE reports are no longer valid. This table will lean towards a renormalized state until the reports are defined. There are currently (2004) 4 positions for objectives within each proficiency level; the 4 objective scores will be carried on each proficiency row. Adjustments can be made later when use is known. The sequences in the RPTE and TELPAS results tables will be synchronized with each other. TELPAS_SEQ provides the sequence and will be consistent for a source row across TA, OBJ, and ITEM results tables.

12 F_SAT_REASON_STUDENT_RESULTS
   Description: This table contains Student scores for all SAT tests. Business Rule(s): All SAT records with a valid Student ID will be loaded to the data warehouse after the ODS process has cleaned up the records using SSN, Name, and date of birth to match and find student IDs. This generally requires a manual process that often needs visual inspection to find some of the students. If the record has a valid student ID, it will only be excluded if all test data is missing. There may be duplicate records; they will all be loaded using a sequence number to make each record unique.

13 F_SDAA_STUDENT_ITEM_RESULTS
   Description: Contains a student's results for a particular administration of the SDAA. This Fact Table contains the Student's responses to the individual test items.
   Comments: Accountability will be identified through relationships with Location. Tested Grade will be derived from the source field/column Instructional

18 F_TAKS_STUDENT_ITEM_RESULTS
   Description: The TAKS STUDENT RESULTS fact table contains the detailed item test results for a student. This table holds the students' responses to individual assessment items. If the response is correct it is flagged so.
   Comments: 1) District_Cnt and Campus_Cnt are determined by the Fall PEIMS COUNTY-DISTRICT-CAMPUS NUMBER versus the Student's current COUNTY-DISTRICT-CAMPUS NUMBER. 2) Each subsequent item number (ITEM 01, ITEM 02 ...) will contain item numbers that may or may not correspond with the column number.

28 FH_STUDENT_HISTORY
   Description: The student history table acts as a slowly changing dimension for a particular student. This table tracks key changes of critical student information used in district and campus reporting.
   Comments: Since this table has dimensional functionality in the model, it will carry its own independent primary key (surrogate). A unique key consisting of a subset of the foreign keys will be selected (student/calendar date is the current expectation).
7  FS_TEJAS_LEE_RESULTS

2  RISD_OP_PARMS
   Description: Contains target scores used in calculating SAT metrics. Contains target scores used in calculating AP metrics. Contains target scores used in calculating ACT
   Comments: The parameter type will determine which value column is populated.

4  DW_CODES_TRANSFORM

6  DW_ETL_PROCESS
The DDL to create DW base tables was created from Oracle Designer.
The DDL was used to create the data warehouse physical tables. The
data warehouse and ODS metadata was imported into Oracle
Warehouse Builder (OWB) for use in mapping source to target elements.
Importing the metadata saves time and avoids typing errors.
Below is sample DDL that has been included in this document for
illustration purposes.
-- C:\Projects\Roy\dwcutX\DWRISD.tab
--
-- Generated for Oracle 10g on Mon May 02 17:02:03 2004 by Server
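The generated file is not reproduced in full here. As a hedged illustration only (the table and columns below are hypothetical, not the actual generated definitions), a statement of the kind found in the .tab file might look like the following, placing the table in the DWD data tablespace and its primary key index in the DWI index tablespace described in section 4.2.

-- Illustrative only: not the generated DDL.
CREATE TABLE d_test_area_demo (
  id     NUMBER(10)    NOT NULL,
  code   VARCHAR2(10)  NOT NULL,
  name   VARCHAR2(60),
  descr  VARCHAR2(240),
  CONSTRAINT d_test_area_demo_pk PRIMARY KEY (id)
    USING INDEX TABLESPACE dwi
)
TABLESPACE dwd;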
Approvals:
RISD Technical Architecture
Document Control
I. Change Record
III Distribution
1 Library Copy
Contents
Document Control...............................................................................2
I. Change Record............................................................................2
II Reviewers .....................................................................................2
III Distribution...................................................................................2
1.0 Introduction
1.1 Background
The RISD Data Warehouse environment will provide decision makers throughout the District with
information to help improve student achievement. Users will access the system via Windows based
Internet-capable computers to accommodate the needs of both computer novices and experts. Data
that is currently processed independently in multiple “applications” will be integrated into a single
environment to facilitate data reporting and analysis across test areas.
Discoverer report data will be available online and can be downloaded into local applications where
appropriate (for example, spreadsheets and PC databases) to perform additional analysis or for
integration with local data. Phase I will primarily focus on student assessment data.
User groups will be restricted to accessing information associated with their responsibilities, needs,
and skills. The majority of end users will run parameter driven reports to obtain multiple views of the
assessment data. Dashboard reports will provide summarized tabular and/or graphical information to
1.2 Purpose
The Technical Architecture Document describes all Development, Test and Production components,
including the hardware, platform, software versions, set up configurations and development
repositories. This document defines the RISD Data Warehouse architecture including the Extract,
Transform, and Load (ETL) tools needed to implement the defined architecture, along with the tools
needed to access and support the environment. The architecture must support requirements
gathered through interviews and work sessions with the technology staff and the review of existing
and proposed technical and infrastructure documentation. This analysis provides RISD with a clearly
defined guide to the data warehouse solution from both a business and technical standpoint that can
be incrementally implemented.
In this section, three alternative hardware configuration approaches are discussed. RISD uses HP
hardware and the hardware platform has been selected. However, this section is included for
general information.
1. Low Cost Solaris Open Source Multi-Machine Data Warehouse Solution - Multiple smaller
commodity machines using off-the-shelf Intel devices possibly connected to a Storage Area
Network (SAN)
2. Multi-Machine Data Warehouse Solution - Multiple smaller machines used in an Oracle 10g
Real Application Cluster (RAC) environment possibly connected to a Storage Area Network
(SAN)
Assumptions:
2.4 Conclusion
The total cost of ownership (which includes maintenance and support), the available resources, and the need to incorporate additional applications in the future drove RISD to select the multi-machine data warehouse solution.
The following outlines the RISD production environment:

Environment        Description
Database Server    2 HP DL585 servers; processors at 2.6 GHz; 4 CPUs per server; 16 GB of memory
OID Server         2 HP DL385 servers; 2 CPUs; 2 GB of memory
Server Storage     HP EVA 5000; 2 HPA (Fiber Channel connections to the HP SAN); shared between testing and production
ODS – there will be three instances on the single ODS machine – Development, Test, and Production. Currently, the ODS uses disk drives directly connected to the ODS machine. In the future, ODS information may be moved to the common SAN device. Edbert, the name of the ODS server, has a 3.06 GHz processor with 2,096,666 KB of RAM and a 448 GB disk (330 GB of free space).
DW Development – The DW will have its development and test environment on a separate test
machine with attached storage.
• 4 - HP ProLiant DL360 G4 Servers
• All with dual P4 Xeon 3.6GHz/1MB cache processors
• The 2 App servers have 2GB RAM and 2 - 72GB 15k mirrored drives
• The 2 DB servers have 4GB RAM and 2 146GB 10k mirrored drives
Portal Server – The HB25 blade server will have eight slots available for load balancing access to the
new SIS and DW environment.
The following diagram shows a high level representation of the production environment.
This section describes the major components of the proposed Business Intelligence architecture
needed to support RISD business needs. The data model, which incorporates the data access requirements documented in the requirements document, drives the physical database design. The next step in the process identifies source elements and maps them into the data model elements that will become rows in the data warehouse tables (see section 4, Technical Requirements). The
final step captures the Extract, Transform and Load (ETL) processes of the database update design.
The physical architecture needs to support both the backend database processing as well as front-
end access requirements with all associated network links. The final selection was based on several
factors including:
• Total cost of ownership.
• The ability to meet growth projections (Scalability).
• The ability to satisfy system requirements and constraints.
[Diagram: high-level data flow. Conversion files from the source systems (RIMS, ACCESS database, PEIMS, EDSOFT, HR) feed the staging area and DW; the District Assessment Team drives periodic updates and summarizations.]

[Diagram: portal home page layout. A general area (logo, messages, etc.), Assessment Portal links, and Assessment Dashboard areas for the power user and for public or district employee access.]
Data modeling uses the outputs of the requirements gathering, documented in the requirements document, to answer the following fundamental questions:
The answers to these questions are formulated and encapsulated into a “Data model.” A data model
is an expression of the business objects needed to support priority business processes in graphical
The Oracle Data modeler used Oracle Designer and the inputs from the reporting and security
requirements to create the updated logical graphical entity-relationship diagram (see data model).
The functional model defines the data entities and establishes the relationships between and among
these entities. The updated functional model led to the creation of the Logical Data Model (LDM),
which shows how data from multiple entities can be accessed to provide the necessary data to high
priority reporting and analysis processes. The data modeler ensures that data needed by the
business processes can be accessed through the LDM. The final step is the creation of the Physical
data model where entities are collapsed or expanded into physical tables and foreign key
relationships and constraints are created as needed. The physical model extends the LDM requirement that the data be accessible by ensuring the data is physically arranged in the best way to
satisfy the business requirements. The steps needed to create the physical model are important to
understand since the physical data model along with historical data requirements will drive the design
of the technical architecture.
The remainder of this section discusses how Oracle Designer was used to create the physical data
model. The Repository Object Navigator was used to create a new application with the Data
Warehouse Type Property set to ‘Yes’. The data warehouse flag on the application system indicates
to Oracle Designer that validation must be performed on the design to ensure that it conforms to a
star/snowflake model commonly used in a business intelligence design. This flag also ensures that
default values suitable for data warehouse designs are presented when elements are created in the
Repository.
Entities can only be created as either ‘fact’ or ‘dimension’ using the Data Warehouse Type property.
The Database Transformer uses this information to create corresponding fact or dimension tables in
the Repository. For each entity that is implemented as a fact table, the transformer creates a
BITMAP index for every Foreign Key pointing to a dimension table. For each logical model, a Server
Model Diagram is created. In the server model diagram, the tables that will be implemented and the
relationships between and among them are identified. All data objects defined in the RISD
Warehouse Architecture will be modeled in Designer before being physically implemented. This
allows changes to the physical model during the build process as constraints and more detailed
access needs are defined without impacting the overall architecture described in this document.
That is, we have enough information to create the business intelligence architecture without locking
into a final database design.
3.3 ETL Components
Assessments will be loaded into the Operational Data Store (ODS) as files become available. A
matching process will append the RISD student ID to all records. Records that do not match will be
sent to a correction file that will be used by the RISD assessment team to determine the proper student ID. Records that cannot be assigned a RISD ID will have the ID set to null.
The data warehouse will have an automated process that will run nightly to process any new
assessment files available in the ODS. The assessment team will also have the ability to initiate the
process immediately. The initial process will run in preliminary mode and only the assessment team
will have the ability to view the data and run reports.
When the assessment team is satisfied with the results, the assessment is set to final, and all users
will have access to the data and the reports.
Student data will be updated nightly in the ODS from RIMS data. The data warehouse will run a
process that will update all records that have changed since the last DW update.
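A minimal sketch of such an incremental update is shown below; the table names, the change-tracking column, and the load-control table are all hypothetical stand-ins for whatever the OWB mappings actually generate.

-- Sketch only: merge ODS student rows changed since the last warehouse load.
MERGE INTO d_student_demo dw
USING (
        -- Only rows changed since the last recorded warehouse load.
        SELECT risd_id, first_name, last_name, campus_id
        FROM   ods_student_demo
        WHERE  last_update_date >
               (SELECT MAX(load_date) FROM dw_etl_process_demo)
      ) src
ON (dw.risd_id = src.risd_id)
WHEN MATCHED THEN UPDATE SET
       dw.first_name = src.first_name,
       dw.last_name  = src.last_name,
       dw.campus_id  = src.campus_id
WHEN NOT MATCHED THEN INSERT (risd_id, first_name, last_name, campus_id)
       VALUES (src.risd_id, src.first_name, src.last_name, src.campus_id);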
Updating of reference tables will work in a similar manner as the assessment files. Data will be
checked nightly for changes but the assessment team will have the ability to run the process at any
time.
This section documents some of the features that will be used in the implementation of the Oracle
database tables. Oracle9i and 10g have been extensively optimized for business intelligence
applications. By integrating technologies such as parallel queries, bit map indexes, parallel bit-
mapped star joins, materialized views, and transportable tablespaces into the database engine,
Oracle 10g provides the needed flexibility to meet RISD current and future needs. The choice of
Oracle 10g minimizes implementation risks since Oracle 10g is a fully tested, industrial strength
database engine used in over half of all Data Warehousing solutions.
Data Partitioning
Oracle’s dynamic partitioning provides parallelism for performance. Oracle’s partitioning option
enables data partitions based on clear business value ranges for administrative flexibility and
enhanced query performance through partition elimination. As an example of administrative
flexibility, consider the common data warehousing requirement for “rolling window” operations -
adding new data and removing old data based on time. Oracle’s partitioning will allow RISD to add a
new partition, load and index it in parallel and optionally remove the oldest partition, all without any
impact on existing data in other partitions and with uninterrupted access to that data. The
combination of dynamic parallelism and independent data partitioning will give RISD the ability to use
partitioning for efficient data management without dictating and limiting parallelism. Thus, the
administrative burden of managing partitions that can become unbalanced due to unanticipated
growth is avoided. Hash partitioning is also supported in Oracle and can be implemented in order to
spread data evenly based on a hash algorithm for performance. Hashing may be used in conjunction
with range partitions (composite partitioning) to maintain manageability while increasing performance.
Because of the initial relatively small volumes, data partitioning may not be used.
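Should partitioning be adopted later, a rolling-window arrangement might look like the following sketch (all names are hypothetical): a new school-year partition is added and the oldest one dropped without disturbing the other partitions.

-- Sketch only: a fact table range-partitioned by school year.
CREATE TABLE f_results_part_demo (
  school_year  NUMBER(4) NOT NULL,
  campus_id    NUMBER,
  raw_score    NUMBER
)
PARTITION BY RANGE (school_year) (
  PARTITION p_2002 VALUES LESS THAN (2003),
  PARTITION p_2003 VALUES LESS THAN (2004),
  PARTITION p_2004 VALUES LESS THAN (2005)
);

-- Rolling window: add the newest school year and drop the oldest.
ALTER TABLE f_results_part_demo ADD PARTITION p_2005 VALUES LESS THAN (2006);
ALTER TABLE f_results_part_demo DROP PARTITION p_2002;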
Further aiding in the creation of such tables is the capability of creating dimensions and hierarchies
within Oracle 10g. These dimensions enable additional query rewrites for summaries and can be
used by OLAP tools. Also, CUBE and ROLLUP operators were added to Oracle8i and expanded in
Oracle 9i and 10g through Business Intelligence (BI) Beans to help reporting and analysis as well as
help developers perform OLAP style aggregation more efficiently.
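As a small illustration of the kind of aggregation ROLLUP enables, the hypothetical query below (using the partitioned demo table from the previous sketch) produces per-campus, per-year averages together with campus-level subtotals and a grand total in a single statement.

-- Sketch only: ROLLUP adds subtotal and grand-total rows to the grouped result.
SELECT campus_id,
       school_year,
       AVG(raw_score) AS avg_raw_score
FROM   f_results_part_demo
GROUP  BY ROLLUP (campus_id, school_year);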
4.2 Oracle 10g Setup Tasks
1. Determine the disk striping approach based on the tablespace sizing information.
2. Confirm the recommended mirroring approach of using RAID 0+1. (RAID 5 is an option but RISD will use full mirroring)
3. Standard weekly full backups with daily incrementals will be available.
4. Obtain and install all the Oracle patches needed for Oracle 10g.
5. Install Oracle instances based on the recommendation in this document.
6. Maintain naming convention standards.
7. Determine the capacity plan and rollback management strategy for the database.
• Determine partitioning strategy
• Compression: Oracle9i Release 2 and higher compresses data by eliminating duplicate values in a database block. Compressed data stored in a database block (a.k.a. disk page) is self-contained. That is, all database features and functions that work on regular database blocks also work on compressed database blocks.
8. Determine the Oracle database parameters
9. Use Oracle’s Optimal Flexible Architecture (OFA) to setup the Oracle 10g Database layout.
Oracle Corporation recommends that the OFA standard be implemented when installing and
configuring Oracle 10g databases.
RISD will use a single instance for the production data warehouse on a production machine. The test
machine will support both a development and test instance using direct access disks or with a section
of the HP SAN allocated to the separate development, test, and production instances. Oracle
recommends creating a single instance since it is the easiest environment to maintain and the most
cost effective initial configuration.
During the life cycle of the RISD warehouse, these three independent environments can be tightly
coupled by sharing the OWB repository. All new developments and changes to the data warehouse
will be made in the DEV project and tested on the DEV environment first.
The key to maintaining the three environments is to propagate the changes in one environment to
another environment. As an example, when the DEV project is changed in OWB, the updated parts in
the DEV project will be exported to a file and then imported to the TEST project in OWB. Then the
TEST project will capture the changes and apply them to the TEST environment using OEM/CM
(Warehouse Upgrade).
All testing, such as any functional tests, stress tests, and user acceptance tests will be done in the
TEST environment, and errors and issues will be fed back to the DEV environment, where DEV
propagates a new version with the modification to TEST and so on. When the change is accepted in
test, the change is propagated from TEST to PROD.
Many versions of the OWB repository will be needed during the life cycle of RISD warehouse. Each
version will represent a static snapshot of the PROD environment OWB metadata prior to the
application of any modifications. Obviously, there must be a way to keep track of the versions and
restore any version if needed.
OWB's Project Archive/Restore Utility uses a file-based approach to store archival information about the various projects defined within the ETL tool. The metadata in the OWB
repository is saved in the file system and restored from the file system when needed. Thus, the OWB
repository can be well protected by combining OWB and some file-based versioning control system
such as PVCS.
The mapping will be documented in OWB while the business rules will be documented in Oracle
Designer.
4.6 Capacity Plan (Data Storage)
RISD has acquired up to 1.8 TB of usable storage to be shared between the new Student Information
System (SIS) and the Data Warehouse. There is additional space allocated for backups and system
Every enterprise should have a disaster recovery plan to protect its core business in case of a
catastrophe. The disaster recovery plan for the RISD data warehouse must be considered as a
component of the district’s disaster recovery plan.
1. Source to ODS Area: Assessment data will be loaded into the ODS from state provided
assessment files and RISD student data from the RIMS system. Student data will be updated
2. ODS to Data Warehouse: The Data Warehouse will keep atomic detail and summary data that
will grow to 13 years. The warehouse will begin with approximately 3 years of assessment and
student data. The mapping and transformation will be defined in the OWB repository and the
corresponding PL/SQL packages will be generated. OWB can identify the location and the format
of ODS tables, map the fields in the ODS to the columns in the database and then generate the
necessary load scripts and control files for SQL*Loader.
Another type of data enrichment creates derived data attributes and calculation transformations. The
data model has identified the need to carry pre-calculated fields because of their frequency of use.
All of these types of enrichments are performed to enhance query performance for the end user.
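As a minimal, hypothetical sketch of such an enrichment (all table and column names here are
illustrative, not the actual RISD model), a percent-correct value could be pre-calculated while moving
rows from the ODS into a fact table:

INSERT INTO f_student_test_area
       (student_key, test_area_key, raw_score, items_tested, pct_correct)
SELECT s.student_key,
       t.test_area_key,
       o.raw_score,
       o.items_tested,
       ROUND(o.raw_score / NULLIF(o.items_tested, 0) * 100, 1)  -- derived, pre-calculated field
FROM   ods_assessment o
JOIN   d_student      s ON s.student_id   = o.student_id
JOIN   d_test_area    t ON t.test_area_cd = o.test_area_cd;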
Oracle Workflow (OWF) is an Oracle application designed to manage complex process workflows.
Because of the complexity of the ETL processes, OWF is introduced and integrated into OWB to
manage the dependencies of the data warehouse load processes.
Oracle Enterprise Manager (OEM) is made up of three tiers:
• A centralized Console and Management Applications that present the user interface
administrators use for management tasks such as scheduling jobs and monitoring database
performance and resource usage.
• Oracle Management Servers that control communication between the managed nodes and
the OEM management applications. This is also where the OEM repository resides.
• Managed Services and autonomous Intelligent Agents that manage services or targets, such
as databases, application servers, web servers, nodes, or applications. The Intelligent
Agent, a process that runs on each of the system nodes where a managed service resides,
executes the jobs and events sent by the Oracle Management Server after they are received
from the consoles and the management applications. The Management Server
communicates with the Intelligent Agents over Net8.
OWB has an interface to OEM for job scheduling. OWB jobs, under the control of OWF, will be
registered with the Oracle Management Server from OWB rather than from the OEM console. The
Intelligent Agents on the managed nodes will then perform the job execution and monitoring functions.
OEM is a very powerful tool for database management. In addition to job scheduling, it can also be
used to perform other database management functions such as capacity planning, session
monitoring, performance tuning, and change management.
• Tuning Pack is a set of tools to help collect, evaluate and implement tuning changes that
impact database performance. The Tuning Pack helps identify and correct database and
application bottlenecks such as inefficient SQL, poor database structures, and improper use
of database resources. The Tuning Pack’s focus is on the highest impact performance areas
such as Application SQL, Indexing Strategies, Instance configuration, Object sizing,
placement and reorganization. The Oracle Tuning Pack contains the following applications
that will be used in the RISD Warehouse Architecture:
⇒ Oracle Expert which discovers tuning opportunities
⇒ Oracle Index Tuning Wizard which recommends index strategies
⇒ Oracle SQL Analyze which helps tune SQL statements
⇒ Oracle Tablespace Map which helps monitor tablespace usage
⇒ Oracle Reorg Wizard which helps correct space usage problems
• Diagnostics Pack is a set of tools to help the database administrator monitor database
activities and identify problems when something goes wrong in the database. The Oracle
Diagnostic Pack contains the following applications that will be used in the RISD Warehouse
Architecture:
⇒ Oracle Advanced Events for database event audit and alert
⇒ Oracle Performance Manager for showing performance data graphically
⇒ Oracle Capacity Planner for planning future capacity requirements
• Change Management Pack is a set of tools for managing complex changes in the Oracle
Database Server. In the RISD Warehouse Architecture, the Change Management Pack will
monitor the differences between the OWB repository and the physical data warehouse
components and report the differences. Based on these reports, OWB can generate change
code and scripts, which can be applied to the data warehouse without affecting data and
table structures already in the database.
Together with the core OEM functionality, these “packs” will help the data warehouse DBA monitor
and manage the day-to-day operations of the RISD Data Warehouse.
OWB and its sibling products share a common metamodel. Although each product has its own
repository, they are able to share each other's metadata via the Object Management Group's (OMG)
Common Warehouse Metadata Interchange (CWMI) specification. This specification establishes the
syntax for communication so that participating vendors can independently integrate their tools at the
metadata level. The CWMI defines a metadata model of a generic data warehouse architecture and
includes the rules for modeling data warehouse instances.
However, Designer is also helpful in program development, where analyst/programming teams can
generate and create applications in Oracle Forms, Oracle Reports, and Oracle Web PL/SQL scripts.
All of these objects can be generated from Designer and then executed on Oracle's Internet
Application Server (iAS) using dynamic HTML (DHTML) and JavaScript. This is not, in itself,
applicable to the operation, administration, and maintenance of the data warehouse, but it may be
useful for the delivery of data.
6.5 Metadata Capture:
In the data warehouse, data is stored in different components such as the Staging Area, ODS, and
Target Data Warehouse. The models are designed using Oracle Designer. OWB can use those
models as input and generate transformations based on these database objects. OWB is able to
import the metadata from Designer through a metadata bridge. OWB can also directly capture
metadata from the Oracle data dictionary and from other non-Oracle environments if necessary. The
ability to obtain metadata directly from non-Oracle applications may be useful in the future.
6.6 Metadata Update:
Changes to the data warehouse will happen almost as soon as it is built. OWB and Designer are built
to allow changes to the DW structure. If an ETL process needs to change, then the metadata for this
process gets updated in the OWB tool first, and then OWB will generate new DDL, load scripts or the
PL/SQL packages as required. These replacement objects will then be propagated to the data
warehouse.
If a database table or column needs to be changed or added, the change or addition goes into the
Oracle Designer tool first, and the change in Designer will be propagated into the OWB repository via
the metadata bridge. OWB's metadata reconciliation capability can analyze the metadata in Oracle
Designer, determine the differences from the earlier imported metadata, and synchronize them.
6.7 Metadata Sharing:
The data warehouse architecture involves many different tools, and each tool has its own metadata
and repository. Metadata sharing between tools is a feature offered by Oracle for keeping metadata
consistent and reducing work effort. As already stated, OWB can use Designer's metadata, and
other Oracle tools such as Discoverer, Darwin, and the OLAP server can also get metadata from the
OWB repository. Since each tool has its own repository, the metadata is exported from one tool and
imported into another tool with the help of the metadata bridge.
Based on change requests from the users, the warehouse development team will periodically need to
drop, reconfigure, rename, upgrade, or otherwise modify the data objects within the data warehouse.
Changes usually begin with the logical data model in Designer. The metadata changes in Designer
will then be propagated into the OWB repository using the metadata bridge.
Next the changes in the logical design will be applied to the physical database instances. There are
several ways to do this:
• In the traditional way, the DBA can manually create the scripts to CREATE/DROP/ALTER
objects, identify the affected objects, recompile the packages, grant the privileges, and so on.
• OWB 10g provides data warehouse upgrade/drop functionality by integrating with the Change
Management Pack from OEM.
OWB enables you to directly propagate incremental changes in your logical warehouse design to
your physical instances, without having to drop objects or lose existing data. For example, you may
have tables containing data in your physical instances. If you modify the definition of these tables in
OWB (by adding indexes, changing a constraint, or renaming a column), you can directly reconcile
these changes with the tables in your physical instances using the OWB upgrade feature. Changes
are applied in a manner that preserves the existing rows of data within the tables in the physical
instance.
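As an illustration only (these statements are hypothetical and are not the code OWB actually
generates; the object and column names are invented), the kind of incremental DDL such an upgrade
applies in place of drop-and-recreate looks like this:

ALTER TABLE d_student RENAME COLUMN ethnic_cd TO ethnicity_cd;  -- renamed column, existing rows preserved
ALTER TABLE d_student ADD CONSTRAINT d_student_gender_ck
      CHECK (gender_cd IN ('M', 'F', 'U'));                     -- changed constraint
CREATE INDEX d_student_campus_ix ON d_student (campus_key);     -- added index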
OWB utilizes the Change Management Pack (CM) features in the Oracle Management Server of
OEM to analyze the existing contents of a target instance and synchronize them with changes in the
OWB repository. OWB uses OEM/CM to compare the code generated from the OWB metadata
against the physical objects in the target warehouse. This integrated technology then generates new
SQL, PL/SQL and DDL scripts to upgrade the target warehouse objects to match the metadata in the
OWB repository.
Before the new scripts get generated and executed, OEM/CM creates an Impact Analysis Report as
an upgrade plan. This plan outlines the detailed actions for the upgrade. After reviewing the Impact
Analysis Report, developers may choose to perform the upgrade and execute the scripts or continue
to change the OWB metadata.
[Figure: Data Warehouse Metadata Analysis]
A data warehouse needs to be backed up on a regular basis to prevent the loss of data and to
reduce the time for recovery should an error occur. Backing up the database is much more
complicated than backing up flat files. Large data volumes usually translate to backup and recovery
strategies that are different from those of online transaction systems. Because the data in the data
warehouse is loaded from other operational source systems, developers often debate whether to
reload from those sources or, instead, to recover from backups when there is a problem. The issue
with reloading data from the source systems is that transactional and, more importantly, reference
data within the source systems often changes. This means that in these cases the data
warehouse would lose historical integrity if it did not have quality backups from which to recover
this lost data.
Without the database archive log, performing backups on a regular basis becomes very important.
Combining the latest data warehouse backups with reloads of source snapshots can satisfy
recovery requirements. If the database crashes, the latest backup needs to be restored and the
warehouse needs to be incrementally reloaded from the point of the last backup. The longer the
interval between backups, the longer the recovery takes. Unfortunately, sometimes not all
database activities that have occurred since the last backup can be recovered using this reload
approach. One option is to treat the initial load of the data warehouse and subsequent
incremental loads differently: turn the archive log off for the initial load, fully back up the
warehouse after the initial load, and then turn the archive log on for the incremental loads, which
should involve a much smaller volume of records and should not negatively impact performance.
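A minimal sketch of that approach, run from SQL*Plus as SYSDBA (generic Oracle syntax, not an
RISD-specific procedure; the instance must be restarted in MOUNT state to change the log mode):

SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE NOARCHIVELOG;  -- no redo archiving during the one-time initial load
ALTER DATABASE OPEN;
-- ... perform the initial bulk load, then take a full backup ...
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;    -- archiving back on for the smaller incremental loads
ALTER DATABASE OPEN;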
While there are many tools on the market for database backup, Oracle's Recovery Manager
(RMAN) is built into the Oracle 10g server. RMAN is a backup and recovery tool that automates
the backup and recovery procedures with high performance. It is also capable of performing
incremental backups (as long as the redo and archive logs are turned on during database load
and modification), which back up only those data blocks that have changed since the
previous backup.
Archive and restore of purged data will be done with the export/import utility in Oracle 10g. Old data sits on the
data warehouse server's file system after being exported from the database and is then copied to tape or other
storage media. In general, "purge" deletes old data from the current tables and "restore" inserts old data back into
the current tables. Deletes and inserts can become very expensive as data volume grows in the data warehouse.
Many issues will have to be addressed, such as the need to maintain indexes, the need for large rollback
segments, the generation of tablespace fragmentation, the need to maintain statistics for the Cost Based
Optimizer, and so on.
Oracle 10g employs its range partitioning capabilities to respond to the foregoing issues. It divides a large table
into smaller partitions and limits operations (insert, delete, export, import) to these local partitions. Historical fact
data can then be purged, exported, or restored one partition at a time.
However, the structure of the table for the partition being purged may change over time. Should a restore be
required in this case, the old data may have a different structure than the current table, so this data cannot be
imported into the current table directly. In order to overcome this issue, a standalone temporary table can be
used to help the archive and restore. During the purge, data in the partition to be purged is exchanged to a
standalone table, then exported and then archived to archive media from there. When restoring, the old data will
be imported into a standalone table. If the table structure is unchanged from when the restored data was
originally purged, the standalone table will be indexed, statistics will be analyzed, and then data will be
exchanged with an empty partition created on the partitioned table for holding the restored data.
If the restored structure has changed, and the changed data is important to the end-users, then the data may
have to be accessed from the standalone table in some ad hoc way. Also, since the old data is loaded into the
standalone table first when restoring, it provides us a chance to prepare the new partition without affecting the
partitioned table. This allows us to make some adjustments to the incoming data, as required, if the original table
structure has changed over time.
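A minimal sketch of the purge/restore exchange described above, reusing the hypothetical table and partition
names from the earlier partitioning example (none of these object names come from the RISD schema):

-- Purge: swap the oldest school-year partition into a standalone table, then export it.
CREATE TABLE f_assessment_arch AS
  SELECT * FROM f_assessment_results WHERE 1 = 0;    -- empty copy of the table structure

ALTER TABLE f_assessment_results
  EXCHANGE PARTITION p_sy2002 WITH TABLE f_assessment_arch
  WITHOUT VALIDATION;                                 -- metadata-only swap, no row movement

-- Export F_ASSESSMENT_ARCH with the export utility, copy the dump file to archive media,
-- then drop the standalone table and the now-empty partition.

-- Restore: import the dump into a standalone table, index and analyze it, and exchange it
-- with an empty partition prepared on the partitioned table.
ALTER TABLE f_assessment_results
  EXCHANGE PARTITION p_sy2002 WITH TABLE f_assessment_arch
  WITHOUT VALIDATION;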
ETL is the process of duplicating, cleansing, rationalizing and integrating data from one or more data
sources into the various components of the data warehouse. Obviously, because of the variety of
data sources and data formats that are often involved, this is where most of the effort and resources
are expended during any data warehousing development project.
Custom building ETL routines by hand using a procedural language like C or PL/SQL can be a very
complex and expensive proposition. Hand-written ETL routines tend to hide metadata such as data
object definitions, transformations, and business rules in hard-to-understand programming code. As
in most large-scale programming efforts, documentation is incomplete or lags far behind actual
changes to the code. As developers leave, knowledge of these ETL programs leaves with them. New
programmers have to catch up as best they can with little or no accurate documentation to guide
them.
Instead, experience has shown that in most cases the ETL process can be greatly improved through
the use of a dedicated ETL tool such as OWB.
Data quality is defined as an agreed-upon level of "correctness" and "trust" for data elements within
database repositories. A major goal of data integration/information management efforts is improving
the "hygiene" of the data used to make decisions. This, of course, implies that RISD has processes
and procedures in place by which the various components of the organization can come to
agreement on what "quality data" actually means. Data warehousing can help because, when done
correctly, errors in source systems found during the ETL/cleansing process can be fed back to those
source systems, thereby improving data hygiene throughout the information system data flow over
time. This process is referred to as "Closing the Loop" and is a major benefit of integrating and
managing data more effectively.
• Data Object Metadata describes the physical data and is typically stored in a database
catalog. Developers and database administrators access it using database tools such as SQL
(see the dictionary query after this list). Some of the information categorized as Data Object
Metadata is:
⇒ Database Names
⇒ Table Names
⇒ Column Names
⇒ Object sizes
⇒ Object Data Types (Character, Integer, …)
• Data Movement Metadata describes the movement of data from source to target. Data
movement metadata includes information about the selection and extraction of data,
mapping, transformation, and loading of data. Data movement metadata can be found in ETL
or replication tools or in the logic of the code written to perform the data movement function.
Some of the information categorized as Data Movement Metadata is:
⇒ Source and Target data object definitions
⇒ Mappings between Source and Target objects
⇒ Transformation Rules
⇒ Transformation Restrictions
⇒ Derivation Rules for derived target objects
⇒ Data movement scheduling information
⇒ Data anomalies and resolutions
⇒ Process Dependencies
⇒ Versioning information
⇒ History of prior data movements
• Business Rule Metadata describes how the business operates through the use of its data.
Business Rule metadata describes entity relationships, cardinality, and domain rules that
define the use of data. Business Rule metadata typically exists in data modeling or CASE
tools, or in other forms of documentation maintained outside of a tool: a word processing
document or a spreadsheet. Some of the information categorized as Business Rule Metadata
are:
⇒ The relationship between two entities of data in the logical data model.
⇒ The cardinality between those same entities.
⇒ Valid values for Attributes
• Data Stewardship Metadata describes who in the organization defines the data, who is
responsible for the systems or processes that create, maintain, and delete data, and who
consumes the data or directly uses the data or information to do their jobs. Some of the
information categorized as Data Stewardship Metadata is:
⇒ Organization Information Policy
⇒ Data Security Policy and Rules
⇒ Person or Organization responsible for defining, creating, reading, updating, and
deleting the data
⇒ Data Access Policy and Rules
• System Metadata describes all objects of a system from data files or tables, to programs, to
scripts and jobs, to screens. System metadata is a cross-reference of all of the components
that make up the system and how the components are shared and re-used. Some of the
information categorized as System Metadata are:
⇒ Standard re-usable objects
⇒ Component Descriptions
⇒ Rules for re-use of system objects
⇒ Component test policies and rules
• Data Access / Reporting Metadata describes the data access methods and how data has
been defined to those access methods or tools. Data Access and Reporting metadata may
also describe the steps that must be taken to get authorization to read the data, the
description of how the data can be interpreted, available tools, and descriptions of reports.
Data Access and Reporting metadata typically is found within business intelligence tool
repositories and in traditional types of documentation (i.e. desktop databases, word
processing documents and spreadsheets). Some of the information categorized as Data
Access / Reporting Metadata are:
⇒ Names and descriptions of standard queries and reports
⇒ Queries used to obtain data for reports
⇒ Report-level security policies
⇒ Rules on data display
⇒ Description of data used on reports
⇒ Query or report change history
⇒ Query and report usage and performance statistics
⇒ Query and report scheduling information
• Data Quality Metadata describes the quality of the data: its accuracy, its confidence level, and
the change history of the data values and definitions. Some of the information categorized as
Data Quality Metadata is:
⇒ Change history of data objects
⇒ Validity Rules and Policies
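Much of the Data Object Metadata above can be pulled straight from the Oracle data dictionary. A
minimal sketch (the DW schema name and the F_ table-name pattern are assumptions):

SELECT table_name,
       column_name,
       data_type,
       data_length
FROM   all_tab_columns
WHERE  owner = 'DW'                       -- assumed warehouse schema
  AND  table_name LIKE 'F\_%' ESCAPE '\'  -- fact tables only
ORDER  BY table_name, column_id;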
This is by no means a complete listing of the metadata that exists, or should exist, within RISD.
Oracle Warehouse Builder is not merely an ETL tool; it allows users to design ETL processes, target
warehouses, and intermediate storage areas. Oracle Warehouse Builder is a core component of
Oracle's Data Warehouse strategy, tightly integrated with the entire stack of products Oracle offers to
customers. A summary of its characteristics and benefits follows:
KEY FEATURES
Integration
ETL Functionality
Supported Targets
Standards Conformance
• OMG CWM
• Open standard
• Utilizes XML Metadata Interchange (XMI)
• Powerful object model
• Spans spectrum related to ETL and analysis
Metadata Management
• Advanced validation framework
• Multiple-User Environment
Supported Sources
• Oracle
• SAP R/3
• Flat Files
• ODBC
• DB2, Sybase, Informix, SQL Server (via Oracle Transparent Gateways)
• Mainframe (with Oracle Pure Extract)
Reporting
• Support for multiple Portlets within the Warehouse Builder Browser component
• Metadata Impact Analysis Reporting
• Metadata Lineage Reporting
• Portlet based technology
• Secure framework
• Customizable environment
• Public views on both design time and runtime environment
Life-Cycle Management
• Source metadata reconcile:
⇒ Re-import existing source objects
⇒ Reconcile with current definitions
⇒ Impact analysis
• Warehouse upgrade:
⇒ Standalone Change Management Pack
⇒ Create/Drop/Add/Rename objects
⇒ Impact Analysis report
⇒ Generate upgrade scripts
⇒ Store intermediate data if required for change
[Architecture diagram: client traffic over HTTP (80) and SSL (443) passes through a load balancer to
oidp.risd.org, which fronts two fully redundant instances (Instance 1 and Instance 2).]
Oracle Internet Directory and Portal share the same Metadata Repository (MR) database.
The MR database is a RAC database and it has two OID Instances.
A load balancer load-balances LDAP requests from clients.
SSO and Discoverer are SSL-enabled, while Portal is not.
SSO, Portal and Discoverer middle-tiers reside in the DMZ, while OID and the RAC database reside behind
the firewall.
Approvals:
RISD ETL Test Results
1 Document Control
1.3 Distribution
Name Position
System Library
Contents
1 DOCUMENT CONTROL ...................................................................................... II
1.1 Change Record ................................................................................................ ii
1.2 Reviewers ......................................................................................................... ii
1.3 Distribution ........................................................................................................ ii
2 INTRODUCTION......................................................................................................1
2.1 Background........................................................................................................1
2.2 Purpose ..............................................................................................................1
2.3 Related Documents ..........................................................................................2
3 TEST CASE SCENARIO........................................................................................3
2 Introduction
2.1 Background
The RISD Data Warehouse environment will provide decision makers throughout the District with information to help improve
student achievement. Users will access the system via Windows based Internet-capable computers to accommodate the needs of
both computer novices and experts. Data that is currently processed independently in multiple “applications” will be integrated into
a single environment to facilitate data reporting and analysis across test areas. All data in the data warehouse will be sourced
from the Operational Data Store (ODS) including assessment information, assessment standards, and student information.
Discoverer report data will be available online and can be downloaded into local applications where appropriate (for example,
spreadsheets and PC databases) to perform additional analysis or for integration with local data. Phase I will primarily focus on
student assessment data.
User groups will be restricted to accessing information associated with their responsibilities, needs, and skills. The majority of end
users will run parameter driven reports to obtain multiple views of the assessment data. Dashboard reports will provide
summarized tabular and/or graphical information to various user groups. The system is designed so RISD can develop relevant
reports as new data becomes available. Some power users will go directly to Discoverer Plus to run reports and create new
reports on an “as needed” basis while most users will use Discoverer viewer or a restricted version of Discoverer Plus to run the
parameter driven reports and/or view static standard reports.
2.2 Purpose
This document defines the Extract, Transform, and Load (ETL) test plan and shows the expected and actual results if different
from the expected results. All data will be sourced from the RISD Operational Data Store. The system and user acceptance test
will use actual data from the ODS. To ensure thorough testing, some unit tests will include conditions that must be coded for but
may not appear in the live data. For example, the unit test will include a test for a RISD student ID on the assessment file that
does not have a corresponding student record. While this condition theoretically cannot happen, it must be tested.
This document is used to gain confirmation that the ETL process is working as expected. Please note this document validates
ETL functionality. See the production readiness document for information on production procedures.
Student Data
Student data will be updated nightly from the ODS Student Information System (SIS) tables. All data changed since the last update will be
updated in the DW.
Other reference, Assessment Standard data, and employee security data
Employee updates will be included in the daily update process. Every night an automated process will run against the ODS and the employee
table will be reloaded.
Test Run Acceptance: Test case execution will be evaluated using the following pre-determined acceptance criteria:
• All the tests have been fully executed; if steps were not executed, they are identified and the reason for non-execution is
clear and approved by the reviewer.
• All the conclusions conform to the expected results. If not, deviations have been referenced in the test case (column
«actual results») and, if applicable, corrective actions have been initiated and documented.
Execution Instructions:
• Live data for up to three years of testing will be used to ensure most if not all test conditions will be available.
• Special condition testing will occur during unit testing. That is, conditions not expected in “live” data will be contrived in the
unit test environment to ensure the process works correctly.
• For each expected result, the conclusion is either “Pass” or “Fail”; nothing else is expected.
There are three phases to the testing process: unit test, system test, and integrated (user acceptance) test. The unit test will start with a snapshot of data
provided in the ODS. A snapshot of ODS data will be copied to the DW machine and used for unit testing. Some conditions will need to be added to the
“live” data to ensure all scenarios are addressed. For example, some records will be changed to have invalid RISD student IDs to make sure the record
causes the process to abend (referential integrity check).
The system test will obtain test data from the ODS test environment. For the system test, the data warehouse will begin with empty tables and all ODS
data will be loaded into the DW environment. RISD student information will be updated daily in the production environment, and the mechanized daily
update process will be tested during the integrated system test. NOTE – Final acceptance is part of production readiness.
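A hypothetical sketch of the referential integrity check mentioned above, listing staged assessment rows whose RISD
student ID has no matching DW student record (table and column names are illustrative only):

SELECT a.student_id,
       COUNT(*) AS orphan_rows              -- rows that would force the load to abend
FROM   stg_assessment a
WHERE  NOT EXISTS (SELECT 1
                   FROM   d_student s
                   WHERE  s.student_id = a.student_id)
GROUP  BY a.student_id;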
Step N° | Condition | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
1 ODS input assessment has a Process abends (unit test forces Abend verified during unit Pass
RISD Student ID for a student a bad record, this condition testing – part of production Fail
that is not in the DW student should not occur in production readiness
tables and is a fatal error)
2 ODS input assessment has a RISD student is added to the Verified in multiple files Pass
RISD student ID with a null DW with a default student ID and Fail
value assessment record is added to
the assessment tables and the
corresponding assessment
dimensions.
3 ODS assessment file has valid Assessment record is added to Verified in multiple files Pass
RISD student IDS that have the DW with corresponding Fail
student records in the DW assessment dimensions.
4 The assessment file has All records with invalid data are Verified in multiple files Pass Data in the ODS is
invalid values for data fields added to the data warehouse but Fail loaded “as is”. Data that
that are not key fields invalid values are defaulted to is suspect is added as
unknown. varchar
5 Load TAKS data as final - Fact data is loaded from the Tested in all loads Pass
Select a year and TAKS data into the fact tables Fail
administration.
6 Load Dimensions associated Load properly populates Tested in all loads Pass D_OBJECTIVE
with Assessment Dimension Tables Fail D_OBJECTIVE_ITEM
D_STUDENT_TAKS
7 Load Load properly populates item Tested in all loads Pass Dependent on objective
F_TAKS_STUDENT_ITEM_ fact tables Fail and objective item keys
RESULTS being correct in the ODS
Step N° | Condition | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
8 Load Load properly populates Tested in all loads Pass Dependent on objective
F_TAKS_STUDENT_OBJ_R objective fact tables Fail and objective item keys
ESULTS being correct in the ODS
9 Load Load properly populates test Tested in all loads Pass
F_TAKS_STUDENT_TA_RE area fact tables Fail
SULTS
Step N° | Execution Steps | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
1 ODS input assessment has a Process abends (unit test forces Abend verified during unit Pass
RISD Student ID for a student a bad record, this condition testing – part of production Fail
that is not in the DW student should not occur in production) readiness
tables
2 ODS input assessment has a RISD student is added to the Verified in multiple files Pass
RISD student ID with a null DW with a default student ID and Fail
value assessment record is added to
the assessment tables and the
corresponding assessment
dimensions.
3 ODS assessment file has valid Assessment record is added to Verified in multiple files Pass
RISD student IDS that have the DW with corresponding Fail
student records in the DW assessment dimensions.
Step N° | Execution Steps | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
4 The assessment file has invalid All records with invalid data are Verified in multiple files Pass
values for data fields that are not added to the data warehouse but
Fail
key fields invalid values are defaulted to
unknown.
5 Load Load properly populates fact Tested in all loads Pass Dependent on objective
F_TEKS_STUDENT_ITEM_ tables Fail and objective item keys
RESULTS being correct in the ODS
6 Load Load properly populates fact Tested in all loads Pass Dependent on objective
F_TEKS_STUDENT_OBJ_R tables Fail and objective item keys
ESULTS being correct in the ODS
7 Load Load properly populates fact Tested in all loads Pass
F_TEKS_STUDENT_TA_RE tables Fail
SULTS
8 Load Dimensions associated Load properly populates Tested in all loads Pass D_OBJECTIVE
with Assessment Dimension Tables Fail D_OBJECTIVE_ITEM
D_STUDENT_TEKS
Step N° | Execution Steps | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
1 ODS input assessment has a Process abends (unit test forces Abend verified during unit Pass
RISD Student ID for a student a bad record, this condition testing – part of production Fail
that is not in the DW student should not occur in production) readiness
tables
Step N° | Execution Steps | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
2 ODS input assessment has a RISD student is added to the Verified in multiple files Pass
RISD student ID with a null DW with a default student ID and Fail
value assessment record is added to
the assessment tables and the
corresponding assessment
dimensions.
3 ODS assessment file has valid Assessment record is added to Verified in multiple files Pass
RISD student IDS that have the DW with corresponding Fail
student records in the DW assessment dimensions.
4 The assessment file has invalid All records with invalid data are Verified in multiple files Pass
values for data fields that are not added to the data warehouse but
Fail
key fields invalid values are defaulted to
unknown.
5 Load Load properly populates fact Tested in all loads Pass Dependent on objective
F_SDAA_STUDENT_ITEM_ tables Fail and objective item keys
RESULTS being correct in the ODS
6 Load Load properly populates fact Tested in all loads Pass Dependent on objective
F_SDAA_STUDENT_OBJ_R tables Fail and objective item keys
ESULTS being correct in the ODS
7 Load Load properly populates fact Tested in all loads Pass Used in SDAA II report
F_SDAA_STUDENT_TA_RE tables Fail
SULTS
8 Load Dimensions associated Load properly populates Tested in all loads Pass D_OBJECTIVE
with Assessment Dimension Tables Fail D_OBJECTIVE_ITEM
D_STUDENT_SDAA
Step N° | Execution Steps | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
1 DW_ASMT_BATCH Internal DW table loaded Tables used in Processing Pass Part of production
Fail readiness
2 DW_CODES Internal DW table loaded Tables used in Processing Pass Part of production
Fail readiness
3 DW_CODES_TRANSFORM Internal DW table loaded Tables used in Processing Pass Part of production
Fail readiness
4 DW_ETL_CONTROL Internal DW table loaded Tables used in Processing Pass Part of production
Fail readiness
5 DW_ETL_PROCESS Internal DW table loaded Tables used in Processing Pass Part of production
Fail readiness
6 DW_FACT_AUDIT Internal DW table loaded Tables used in Processing Pass Part of production
Fail readiness
7 RISD_ACCOUNTABILITY_M Minimum size for accountability Tables used in Processing Pass Part of production
IN_SIZE populated readiness
Fail
8 RISD_OP_PARMS Report Metrics are properly Tables used in Processing Pass Part of production
loaded Fail readiness
1. Objective and objective item keys in the ODS match tests – Manually corrected during testing, but an update
process will be created by RISD. Led to delays in verifying results.
2. Mechanization of manual ODS processes – Data was manually verified during conversion. RISD should
mechanize the update process rather than loading the spreadsheet directly into the ODS.
1. SAT assessments contain duplicate records – When matching RISD graduates, only one record with the
highest score is considered in the materialized view.
2. Dummy campuses need calendars to set the school year when loading assessments – Added dummy campuses
for (RISD-defined district but unknown campus) and (unknown district but unknown campus). Added campus 145
to the calendar.
3. School calendar does not address July and August – For assessments, the school year is considered August–July
for conversion. The calendar must be updated in the ODS to reflect the full year.
4. ERWA and Tejas Lee passing indicators are not set because test benchmarks are not available – The assessment
team decided that the information is not critical since the tests will not be used going forward. Raw data from the
assessment file is available.
Approvals:
RISD Discoverer Test Results
1 Document Control
1.3 Distribution
Name Position
System Library
Note To Holders:
If you receive an electronic copy of this document and print it out, please write
your name on the equivalent of the cover page, for document control
purposes.
If you receive a hard copy of this document, please write your name on the
front cover, for document control purposes.
Contents
1 DOCUMENT CONTROL ...................................................................................... II
1.1 Change Record ................................................................................................ ii
1.2 Reviewers ......................................................................................................... ii
1.3 Distribution ........................................................................................................ ii
2 INTRODUCTION......................................................................................................1
2.1 Background........................................................................................................1
2.2 Purpose ..............................................................................................................1
2.3 Related Documents ..........................................................................................2
3 TEST CASE SCENARIO........................................................................................3
2 Introduction
2.1 Background
The RISD Data Warehouse environment will provide decision makers throughout the District with information to help improve
student achievement. Users will access the system via Windows based Internet-capable computers to accommodate the needs of
both computer novices and experts. Data that is currently processed independently in multiple “applications” will be integrated into
a single environment to facilitate data reporting and analysis across test areas. All data in the data warehouse will be sourced
from the Operational Data Store (ODS) that will include assessment information, assessment standards, and student information.
Report data created in Discoverer will be available online and can be downloaded into local applications where appropriate (for
example, spreadsheets and PC databases) to perform additional analysis or for integration with local data. Phase I will primarily
focus on student assessment data.
User groups will be restricted to accessing information associated with their responsibilities, needs, and skills. The majority of end
users will run parameter driven reports to obtain multiple views of the assessment data. Dashboard reports will provide
summarized tabular and/or graphical information to various user groups. The system is designed so RISD can develop relevant
reports as new data becomes available. Some power users will go directly to Discoverer Plus to run reports and create new
reports on an “as needed” basis while most users will use Discoverer viewer or a restricted version of Discoverer Plus to run the
parameter driven reports and/or view static standard reports.
2.2 Purpose
This document defines the Test Results for the Discoverer reports created by Oracle Consulting. The TAKS Objective Summary
by student will be used to test the Virtual Private Database (VPD) security requirements and will be included in the operational
readiness document. All data available in the Data Warehouse will be sourced from the RISD Operational Data Store. Testing will
validate that data is accurately extracted from the data warehouse but does not check whether data was correctly loaded into the
DW. The system and user acceptance test will use actual data from the ODS.
This document is used to gain confirmation that the reporting process is working as expected.
Note: Report data must match what is in the database. If the data does not match expected results, the ETL process must be
reviewed for inconsistencies or to determine why historical reports do not match historical data.
The purpose of the system testing of reports is to ensure that the report is extracting data from the data mart according to report specifications and that
the data is presented in the defined report format (within the potential limitations of Oracle Discoverer). The quality of the data matching the input source
system is addressed in the Extract, Transform, and Load (ETL) update process.
This test case execution will be evaluated using the following predetermined acceptance criteria:
• All the tests have been fully executed; if steps were not executed, they are identified and the reason for non-execution is clear and approved by the
reviewer.
• All the conclusions conform to the expected results. If not, deviations have been referenced in the test case (column «actual results») and, if
applicable, corrective actions have been initiated and documented.
• All the printing generated during the script execution must reference the test case ID/step ID/test run, the test date, and the tester’s initials.
• For each expected result, the conclusion is either “Pass” or “Fail”; nothing else is expected.
System and integration testing will obtain test data from the ODS test environment. For the system test, the data warehouse will begin with empty tables
and all ODS data will be loaded into the DW environment with live conversion data. All data will be static during the system test. RISD student
information will be updated daily, but that is part of the integration test with the ODS. NOTE – Final acceptance is part of production readiness.
TEST PRE-REQUISITES (Verified Yes/No)
• The report testing assumes that all Oracle tables have been defined and loaded in the test environment. YES
• The tester must have access to the DW in order to validate results (proper loading of data into the DW is tested during ETL testing).
Step N° | Condition | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
1.0 1. D_SCHOOL_YEAR - 2004 Selected Parameters Issue with School calendar Pass Training required as
2. D_ADMIN_PERIOD - 0404 display at the top of the corrected in reload to Prod. Added some selections are not
3. D_TEST_AREA - Math Fail
report calendars for dummy campuses logical
4. D_CAMPUS (Location) - All
5. D_TEST_VERSION (TAKS
Version) - All
6. D_EDUCATION_GRAD
E – HS (Tested Grade
Group) - 4
7. D_LEP - all
8. D_ETHICITY - all
9. D_ECONOMIC_DISADV
ANTAGE - all
10. D_SPECIAL_ED - all
11. D_GENDER all
2.0 Verify Total counts Total Count adds up to Pass Match SLC counts for
mastered plus not mastered Fail 2004
3.0 Verify # mastered counts Matches database counts Pass Match SLC counts
for mastered Fail
4.0 Verify # non mastered counts Matches database counts Pass Match SLC counts
for not mastered Fail
5.0 Vary report parameters and Numbers change as per the Note – a special report version is Pass A version of the report
check results as above selection criteria needed when multiple test Fail that accumulates results
administrations are combined and across a school year
the highest individual student score similar to the way SAT
must be taken. results are accumulated
across all administrations
is needed for yearly
results.
2.0 Check counts for the Total equals sum of Test Area was dropping records for Pass Match SLC counts
selected criteria meeting, not meeting 2004 with results of “?”. Code Fail
adjusted to populate with zero to
ensure failure (state defined all “?”
results as failure until an update is
provided by the state with actual
results)
3.0 Verify # meeting counts Matches database counts Pass Match SLC counts
for meeting Fail
4.0 Verify # non meeting Matches database counts Pass Match SLC counts
for not mastered Fail
Step N° | Condition | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
5.0 Vary the parameters Counts change and match Note – a special report version is Pass A version of the report
database counts needed when multiple test Fail that accumulates results
administrations are combined and across a school year
the highest score for each similar to the way SAT
individual must be taken for the
year. across all administrations
is needed for yearly
results.
2.0 Check counts for the Total equals sum of Pass Match SLC counts
selected criteria meeting, not meeting and Fail
commended.
Step N° | Condition | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
3.0 Verify # meeting counts Matches database counts Pass Match SLC counts
for meeting Fail
4.0 Verify # non meeting Matches database counts Pass Match SLC counts
for not mastered Fail
5.0 Verify Number Commended Matches database counts Pass Match SLC counts
for not mastered Fail
6.0 Vary the parameters Counts change and match Pass May need summary table
database counts Fail for performance when all
locations are selected
Step | Condition | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
1.0 1. D_SCHOOL_YEAR - 2004 Selected Parameters Pass Training required as
2. D_ADMIN_PERIOD - 0405l display at the top of the some selections are not
3. D_TEST_AREA - Mathematics Fail
report. logical – Security restricts
4. D_CAMPUS (Location)
5. D_TEST_VERSION (TAKS access to individual
Version) students in production.
6. D_EDUCATION_GRADE -
7
7. D_LEP - all
8. D_ETHICITY - all
9. D_ECONOMIC_DISADVA
NTAGE - all
10. D_SPECIAL_ED - all
11. D_GENDER - all
12. D_TEACHER - all
Step | Condition | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
2.0 Vary Parameters (full access) Only students for selected Pass Order groups of students
parameters are on the Fail by number of correct
report answers.
Step N° | Execution Steps | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Comments
6.0 Vary the parameters Counts change and match Pass
database counts Fail
3.7 TELPAS Results by Grade Summary and Proficiency Rating (two reports)
Step N° | Execution Steps | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Signature / Date
1.0 1. D_SCHOOL_YEAR - 2004 Parameters display at the Discoverer cannot create a Pass Counts match 2004
2. D_ADMIN_PERIOD - all top of the page formatted two-page report. The report created by
3. D_CAMPUS (Location) - all Fail
two-page report is separated into assessment team. Issue
4. D_EDUCATION_GRAD
two separate reports. Oracle with Christie McAuliffe
Tested Grade Group -
Elementary reports would need to be used for school not included in a
5. D_ECONOMIC_DISAD specific formatting. separate count explained
VANTAGE - all since the home school
6. D_SPECIAL_ED - all and not current school
7. D_GENDER - all was used in the report.
Step N° | Execution Steps | Expected Results | Actual Results if Different than Expected Results | Conclusion | Tester’s Signature / Date
1. # Students Rated
2.0 Check counts for selected Minor issue with K/1 objectives Pass Counts reflect database
2. % Tested Beginning
parameters 3. % Tested Intermediate needing definitions. Counts.
Fail
4. % Tested Advanced
5. % Tested Advanced High
(new metric)
6. # Student by grade with
Comprehension score
7. Average comprehension
score by grade (Calculation)
8. # Students with Composite
rating
9. % Tested Beginning by
Grade
10. % Tested Intermediate by
Grade
11. % Tested Advanced by
Grade
12. % Tested Advanced High
13. # Students with Prior year
TELPAS record
14. # Progressing one proficiency
level
15. % Progressing one
proficiency level
16. # Progressing two proficiency
level
17. % Progressing two
proficiency level
18. # Progressing three
proficiency level
19. % Progressing three
proficiency level
20. # Progressing at least one
proficiency level
21. % Progressing at least one
proficiency level
3.0 Vary the parameters Counts change and match Pass Counts reflect data on
database counts Fail the assessment file
1. Some SLC reports had incorrect counts – Obtained correct numbers and verified results.
2. Combined results across administrations for TAKS reporting were incorrect – Need a special report using
cumulative yearly results (a version of the existing report).
3. Administrations for July and August were reported as unknown – Corrected logic to include July and August in
the school year.
4. No school year for unknown campuses and campus 145 – Added school years for campus 145 and unknown
campuses in order to set the year for reporting.
5. No campus code for graduates – When creating the materialized view for SAT and ACT, used the last campus
when the current campus was null.
6. AYP report requirements have changed – Proposed creation of a summary AYP table containing all data
needed to create the AYP report.
Data Integration and Standardization
Phase: Production
Module: Operation
Version: 1.2
Last Update:
Author(s)
Name(s) and Title(s) Signature(s) Date
Document Review
Name(s) and Title(s) Signature(s) Date
Document Approval
Name(s) and Title(s) Signature(s) Date
X X
X X
1. INTRODUCTION............................................................................................................................ 6
2.1 FUNCTIONALITY........................................................................................................................... 6
3. ARCHITECTURE ........................................................................................................................... 8
4.2 WORKFLOW................................................................................................................................ 20
5. OPERATIONS.............................................................................................................................. 26
Reference Number of the Document | Date | Change Description | Changed by
Note: Please review the data dictionary maintained in Oracle Designer. There are special scripts written
to extract data into an Excel spreadsheet for better formatting and easy distribution of the data
dictionary.
2. BUSINESS ENVIRONMENT
2.1 FUNCTIONALITY
The Data Warehouse reporting and analysis environment assists Roy Independent School District (RISD)
decision makers in reporting on and analyzing student assessment data.
Assessment report creation is dependent on current ODS reference tables being available when loading
assessment files, and it is the responsibility of the district assessment team to schedule ODS updates.
The DW ETL control process will access an ODS log file (ASMT_STATS_VIEW) to determine when a DW
process needs to run. For example, a daily “cron” job can run that reads the ODS log file and may trigger
specific ETL processes. All files marked as READY_TO_LOAD in the ODS will be assigned a “batch ID”
when run in the DW environment. The DW ETL process will update the DW and mark the assessment file
as loaded, but it is the responsibility of the district assessment team to review error logs for warnings or
errors from the update processes. Some critical errors will cause the update process to “abend”. In most
cases, there will only be “warning” messages and the ETL process will complete successfully. The Oracle
database manager will only allow numeric values to be placed in numeric fields, so in order to load a
record with invalid data, the field with potentially “bad” data must be defined as character. An alpha value
in an SAT score that is used in calculations would be defaulted to zero. In other cases where there is a
finite domain for a specific field, values not in the domain will be defaulted to unknown. Please note that
the update process has no control over the quality of the data and cannot make value judgments to
correct the data.
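A hypothetical illustration of these defaulting rules during the load; the staging table, column names, and
domain values are illustrative assumptions, not the actual ETL code:

SELECT CASE
         WHEN REGEXP_LIKE(sat_score, '^[0-9]+$') THEN TO_NUMBER(sat_score)
         ELSE 0                                  -- alpha or invalid scores default to zero
       END AS sat_score,
       CASE
         WHEN gender_cd IN ('M', 'F') THEN gender_cd
         ELSE 'U'                                -- out-of-domain codes default to Unknown
       END AS gender_cd
FROM   stg_assessment;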
[Data flow diagram: source systems (RIMS, ACCESS, PEIMS, EDSOFT HR database) feed the DW staging
area; the district assessment team performs periodic updates and summarizations into the DW.]
[Portal mockup: Assessment Dashboard and Assessment Portal links for power users, a general area (logo,
messages, etc.), and a home page for public or district employee access.]
3.1.3 TABLESPACE
Tablespace   Description                         Status      Extent Mgmt   Segment Mgmt   Required GB
DISCOIDX     Discoverer Metadata Indexes         PERMANENT   LOCAL         AUTO           0.5
DISCOTAB     Discoverer Metadata Tables          PERMANENT   LOCAL         AUTO           0.5
DSGNIDX      Designer Indexes                    PERMANENT   LOCAL         AUTO           1
DSGNTAB      Designer Tables                     PERMANENT   LOCAL         AUTO           1
DSGNTEMP     Designer Temporary Space            TEMPORARY   LOCAL         MANUAL         0.5
DWD          Data Warehouse Data                 PERMANENT   LOCAL         AUTO           16
DWI          Data Warehouse Indexes              PERMANENT   LOCAL         AUTO           16
OWBIDX       Oracle OWB Indexes                  PERMANENT   LOCAL         AUTO           0.625
OWBRTIDX     Oracle OWB Runtime Indexes          PERMANENT   LOCAL         AUTO           0.625
OWBRTTAB     Oracle OWB Runtime Tables           PERMANENT   LOCAL         AUTO           0.5
OWBTAB       OWB Tables                          PERMANENT   LOCAL         AUTO           0.5
SYSAUX       Auxiliary space (required by 10g)   PERMANENT   LOCAL         AUTO           0.5
SYSTEM       System tables                       PERMANENT   LOCAL         MANUAL         1
TEMP         Temporary space                     TEMPORARY   LOCAL         MANUAL         25
UNDOTBS1     System area                         UNDO        LOCAL         MANUAL         10
USERS        User area                           PERMANENT   LOCAL         AUTO           0.5
DW Development and Production – the DW will have its development/ test environment on a separate
test machine with attached storage.
Note: Initially both the ODS and DW will have a combined development/test environment. Therefore,
only two instances will need to be maintained.
Portal Server – The HB25 blade server will have 8 slots available for load balancing access to applications.
Software Requirements
• HP-UX 11 64-bit versions
• Oracle 10.1.2 64-bit version
• HP Software to partition the HP boxes.
• Software for Load balancing.
• Veritas software for disk striping and mirroring (with Oracle Managed Files (OMF) support as a
database-level addition)
• RMAN API to the backup software: a separately purchased product provided by the tape vendor.
Some vendor support is provided. A Legato NetWorker driver may be provided as part of your standard
contract.
• Oracle Partitioning to partition large tables.
• Java tool – JDK 1.3.1 (freeware, required for installing Oracle 10g, available at the HP site)
• OS patches – TBD
1. Determine the disk striping approach based on the tablespace sizing information.
2. Confirm the recommended mirroring approach of using RAID 0+1. (RAID 5 will be used by RISD
instead of mirroring.)
3. Install RAC.
4. Standard weekly full backups with daily incrementals (or potentially full daily backups) will be
available. Because of low volumes of updates, it may be acceptable to perform a weekly cold
backup and rerun updates to restore the data warehouse.
5. Obtain and install all the Oracle patches needed for Oracle 10g.
6. Install Oracle instances based on the recommendation in this document.
7. Maintain naming convention standards.
8. Determine the capacity plan and rollback management strategy for the database.
Determine the partitioning strategy (school year will be used).
Compression: Oracle9i Release 2 and higher compresses data by eliminating duplicate values
in a database block. Compressed data stored in a database block (a.k.a. disk page) is self-
contained; that is, all database features and functions that work on regular database blocks
also work on compressed database blocks. (A partitioning and compression sketch follows this list.)
9. Determine the Oracle database parameters.
10. Use Oracle's Optimal Flexible Architecture (OFA) to set up the Oracle 10g database layout. Oracle
Corporation recommends that the OFA standard be implemented when installing and configuring
Oracle 10g databases.
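The following is an illustrative sketch only of range partitioning by school year combined with table
compression; the table and column names are hypothetical and do not represent the actual DW design.

CREATE TABLE f_results_by_year_example (
  student_key    NUMBER     NOT NULL,
  objective_key  NUMBER     NOT NULL,
  school_year    NUMBER(4)  NOT NULL,
  items_correct  NUMBER,
  items_possible NUMBER
)
COMPRESS                                  -- block-level compression
PARTITION BY RANGE (school_year) (
  PARTITION sy_2003 VALUES LESS THAN (2004),
  PARTITION sy_2004 VALUES LESS THAN (2005),
  PARTITION sy_max  VALUES LESS THAN (MAXVALUE)
);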
3.1.8 CONNECTIVITY
See RISD network infrastructure for more information.
Area Superintendent – Yes – All students currently in the Area – Identified in the employee file by
employee ID to obtain role and area
Central Admin – Yes – No restrictions – Identified in the employee file by employee ID and role
Central Superintendent – Yes – No restrictions – Identified in the employee file by employee ID and role
School Board – No – N/A – Summary assessment data marked as final
Public (future) – No – N/A – Summary assessment data marked as final
District Assessment Team – Yes – No restrictions – Identified through employee ID and role entered
through a mechanized update process
Please note that role-based security and VPD will be used. The DBA must be familiar with the security
design. VPD security will be at the student dimension level. Any aggregate data associated with the
student dimension that needs to be available to all users will need to be stored in summary tables (for
example, portal student summary reports). Please note that the assessment fact tables only contain a
surrogate key and individual students cannot be identified; therefore, materialized views can be used
against the assessment fact tables. Please note that security MUST be applied against any table
containing information that can specifically identify a student (including materialized views). Regular
views against a student-level table will carry the security features over to the view. It is important to
identify database objects using the suffix _V or _MV to easily distinguish straight views from materialized
views, since materialized views must be considered when changing the security policy or when new
objects are added to the database. Any student-level report trying to access specific student data will
only return data for which the user has access authority. Role-based security will ensure that users need
to access the database via Discoverer Viewer or Discoverer Plus.
The District assessment team (including developers) will have full access to the database.
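A minimal sketch of how such a policy could be registered with DBMS_RLS is shown below; the policy
function, the context name (risd_ctx), and the predicate are assumptions for illustration and must follow
the actual RISD security design. F_STUDENT_ACCESS is the student access list created during the ETL
process (see the assumptions below); its column names here are assumed.

CREATE OR REPLACE FUNCTION student_access_predicate (
  p_schema IN VARCHAR2,
  p_object IN VARCHAR2
) RETURN VARCHAR2
IS
BEGIN
  -- Exempt roles (assessment team, central admin, developers) get no predicate.
  IF SYS_CONTEXT('risd_ctx', 'access_level') = 'EXEMPT' THEN
    RETURN NULL;
  END IF;
  -- Restrict rows to students on the caller's access list.
  RETURN 'student_key IN (SELECT student_key FROM f_student_access ' ||
         'WHERE person_id = SYS_CONTEXT(''risd_ctx'', ''person_id''))';
END;
/

BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'DW',
    object_name     => 'D_STUDENT',
    policy_name     => 'STUDENT_VPD',
    function_schema => 'DW',
    policy_function => 'STUDENT_ACCESS_PREDICATE',
    statement_types => 'SELECT');
END;
/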
Assumptions:
1. There are at least two Oracle Discoverer IDs available from Portal (RISD_PUB and
RISD_DISCO_EUL). There will also be a minimum number of required IDs for direct-type connec-
tions (SQL*Plus, TOAD, and so on) for the Assessment Team and Development.
2. Portal connections will provide the SSO name to the database.
3. The SSO name will be assigned to a RISD employee (or “dummy” employee) and carried in the
employee table.
4. Matching between teachers in the teacher’s table and the employee table will be done via social
security number.
5. Each employee will be assigned one or more Access Level codes. Upon login, the highest level of
access to which an employee is assigned will be selected as that employee’s access privilege. For
example, if a teacher also has campus administrator rights, the employee would be given the
higher campus administrator rights.
6. Users may have multiple records in the employee table if they have access to multiple campuses
and/or have multiple responsibilities. For example, a campus administrator may have responsibili-
ties at multiple campuses.
7. Assessment team, Central employees (Administrators and Superintendent), Developers, and the
Discoverer Administrator will be exempt from additional student security.
8. Access to individual students is provided by a student access list created during the ETL process.
9. Portal will control database access through Discoverer but the added security will ensure no user
who can somehow gain direct access can view unauthorized student data.
Access from Portal - The SSO Name will be looked up in the DISTRICT_PEOPLE table to obtain the
PERSON_ID. The highest Access Level will be selected and assigned to the person. The VPD predi-
cate will be set.
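A minimal sketch of this lookup is shown below, assuming an application context named risd_ctx that is
created with this procedure as its trusted package, and a hypothetical PERSON_ACCESS_LEVEL
association table; the real implementation should follow the security design.

CREATE OR REPLACE PROCEDURE set_risd_context (p_sso_name IN VARCHAR2) IS
  v_person_id    NUMBER;
  v_access_level VARCHAR2(30);
BEGIN
  -- Confirm the portal user and obtain the PERSON_ID.
  SELECT person_id
    INTO v_person_id
    FROM district_people
   WHERE sso_name = p_sso_name;

  -- Select the highest access level assigned to the person
  -- (person_access_level is a hypothetical association table).
  SELECT MAX(access_level_code)
    INTO v_access_level
    FROM person_access_level
   WHERE person_id = v_person_id;

  -- Make both values available to the VPD predicate.
  DBMS_SESSION.SET_CONTEXT('risd_ctx', 'person_id',    v_person_id);
  DBMS_SESSION.SET_CONTEXT('risd_ctx', 'access_level', v_access_level);
END;
/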
Direct Database Access - The user's roles are checked for Access Levels. If the user has a role that
corresponds to an Access Level, then it is assumed that the user has access to all students (only
selected people will have direct database access, as most users will use Discoverer). The roles then
determine the remaining access to the DB.
People granted access to data via Portal will be identified using SSO and must be in the District People
table. They must also be assigned Access Levels. During connection, the person will be confirmed as
an authorized user by checking the SSO name in the District People table. If the person exists he/she
will be granted access to any tables where an individual student cannot be identified. That is, tables
using surrogate keys will not have security applied to them. The user’s access levels will determine the
student data the user has permission to view.
Direct access users may connect to the database using various tools (SQL*PLUS, TOAD, and so on).
These users require Oracle database login IDs. For access to RISD data, they are required to be
granted an access role. These roles determine the user data access levels. Currently, direct access
with an access role is assumed to be exempt from access restrictions and may see any data in the sys-
tem. Should this change in the future and varying levels of access are required, the user must be en-
tered into the District People table with the SSO_NAME column being set to the user login. In addition,
the user will require a location (campus or area). When the user connects, the user’s roles determine
the data access level permitted. If the user has a valid role, the access level then defaults to the Stu-
dent detail level.
Notes:
1. Access levels must be synchronized with responsibilities in the “risd_portal_responsibilities”
table.
2. Access levels (code) must have a role created for direct access with restrictions.
Hardware options were provided in the architecture document. RISD has decided to use a single load
balancer to help reduce initial costs. Implementing this option is the least expensive alternative, but there
will be no redundancy.
Oracle Internet Directory and Portal share the same Metadata Repository (MR) database.
The MR database is a RAC database with two OID Instances.
A load balancer load balances LDAP requests from clients.
SSO and Discoverer are SSL-enabled, while Portal is not.
SSO, Portal, and Discoverer middle tiers reside in the DMZ, while OID and the RAC database do not.
It is critical that objectives and objective items keys provided by the assessment team match the actual
objectives and objective items provided on the assessments for TAKS, TEKS, and SDAA II. See sec-
tion 4.2 Workflow.
Note: It is highly recommended that RISD create a process in the ODS environment to ensure the ob-
jective and objective item keys created by the assessment team and loaded into the ODS match the
actual keys on the assessments for TAKS, TEKS, and SDAA II. A more comprehensive alternative
would be to extract all objectives and objective items from the assessments and create an update process.
(Diagram: assessment data flow – state and local assessments are periodically loaded into the ODS at the
student detail level on a schedule; the District Assessment Team (1) makes corrections to unmatched
records and (2) marks files as ready to load; the team then runs periodic updates and summarizations into
the DW staging area; source systems include RIMS, PEIMS, EDSOFT, HR, and an Access database; the
Home Page for public or district employee users provides the Assessment Dashboard, Assessment Portal
Links, a General Area (logo, messages, etc.), and a Power User path.)
1. Self Service signup will be used – Sample code was provided to RISD by Oracle Consulting.
2. Initially, only employees in the HR employee table will be allowed to self-register. However, non-
employees such as consultants will be assigned “dummy” employee IDs that will allow them to self-
register.
3. The user ID will be populated in the ODS employee table via a predetermined algorithm. Employee
information is sourced from RISD HR.
4. The process to populate the employee table with the user ID will also map the HR job description
to the Data Warehouse security group. The employee table will be updated daily in both the ODS
and DW.
5. The self-service sign up process will access the ODS table by employee id or user ID. The self-
service sign up process will activate the user ID in the employee table if the person is able to an-
swer selected questions such as date of birth and perhaps social security number.
6. The user ID and password will be maintained in the Oracle Internet directory (OID).
7. RISD must develop a process to change passwords and deactivate user IDs. In addition, a process
to inactivate user IDs for inactive employees must be developed as part of the ODS update process.
4. PRODUCTION READINESS
This section depicts the ETL process and shows the technical components that need to be checked
after each release to ensure changes can be moved to production.
The daily DW process checks the ODS ASMT_STATS_VIEW, which provides the following informa-
tion. Only assessments with the READY_TO_LOAD indicator set to “Y” will be considered.
TEST_CODE
Note: ERWA and Tejas Lee (TLEE) will only be loaded for conversion and are not in the normal update
files.
DSET
This is the code the ODS uses in its processing; it provides a unique identifier for all records in a particular
file. The DW processing creates a batch ID for the DW using this field.
READY_TO_LOAD
Domain is (Y or N). When the assessment team finishes appending RISD student IDs, this field should
be set to Y. When the nightly DW process runs, the DW will pick up all records in the designated as-
sessment area with the source dates that have the indicator set to Y. After the update in the DW, the
DW control file DW_ASMT_BATCH will note that the file has been processed. Once a file is processed,
it cannot be reprocessed without special manual intervention from the RISD technical staff.
ASMT_CODE - Domain is (TAKS, TEKS, SDAA, TELPAS, SAT, PSAT, ERWA, AP, ACT, TLEE,
LDAA)
DSET - The code the ODS uses in its processing and provides a unique identifier to all records in a
particular file.
BATCH_ID – The internal DW batch identifier that indicates a single assessment load for the data
warehouse.
STATUS_CODE – Code is set to “F” for final after the update process completes.
LAST_POST_DATE – This field represents the posting date for when the process completes.
POST_FLAG – Set to “N” for new when the process begins, set to “P” for prepped after student IDs are
validated, and set to “Y” when the process completes.
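The flag transitions described above can be pictured with the following illustrative statements (the bind
variable and exact column names are assumptions):

-- New batch recorded when the load is scheduled.
UPDATE dw_asmt_batch SET post_flag = 'N' WHERE batch_id = :batch_id;

-- Prepped once student IDs have been validated.
UPDATE dw_asmt_batch SET post_flag = 'P' WHERE batch_id = :batch_id;

-- Complete: the batch is marked final and cannot be reprocessed.
UPDATE dw_asmt_batch
   SET post_flag = 'Y', status_code = 'F', last_post_date = SYSDATE
 WHERE batch_id = :batch_id;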
4.2 WORKFLOW
Overview
1. Student updates must occur first during the nightly run. This ensures all RISD students poten-
tially on the assessment tables are available even though it is unlikely a student will be added
the day an assessment file is ready for processing.
2. All employee tables must be loaded daily into the DW.
3. PKS_PRELOADS_MAIN – Updates the FH_ASSESSMENT_ADMINISTRATION table.
Oracle workflow will manage the update processes. Below are the high level processes that will be
controlled from a single master process.
1. Daily student update process – Controls updating Student information from the SIS tables updated
in the ODS on a daily basis. Also updates the employee security table daily. Please note that some
smaller tables are completely reloaded to simplify the RISD maintenance processes.
2. Periodic update of assessment data – There is a single process that controls the updating of all
assessment and other periodic data updates that will run daily. The process checks the ODS for
assessment files that are “ready to run” and have not been processed in the data warehouse.
There will be only one update process for each assessment, and upon completion the posting flag is updated.
The following diagram shows the SIS and HR daily data interfaces.
(Diagram: SIS and HR daily interfaces – the daily loads update F_STUDENT_ACCESS,
F_STUDENT_SCHEDULE, FH_STUDENT_HISTORY, RISD_EMPLOYEES, and D_DISTRICT_PEOPLE.)
(Diagram: TAKS load – (1) ASMT_STATS_VIEW (ODS) is read and DW_ASMT_BATCH /
DW_ASSESSMENT_ADMINISTRATION are updated; (2) RISD_TAKS_TEST (ODS) is staged to
STG_TAKS_HDR, STG_TAKS_READ, STG_TAKS_ELA, STG_TAKS_MATH, STG_TAKS_SCIENCE,
STG_TAKS_SOCSTUDY, and STG_TAKS_LDAA (the latter feeds the LDAA update process); (3) the
staging tables are further split into STG_TAKS_OBJECTIVE and STG_TAKS_ITEM; (4) the fact tables
F_TAKS_STUDENT_TA_RESULTS, F_TAKS_STUDENT_OBJ_RESULTS, and
F_TAKS_STUDENT_ITEM_RESULTS are updated.)
1. An automated process runs and reads the ASMT_STATS_VIEW to determine if any assessment
administration files are ready for processing. There is a simple READY_TO_LOAD indicator that is
set to “Y” by the assessment team when the assessment file is ready to load into the DW. If the in-
dicator is set and the DW process has not run the process yet, the process updates the
DW_ASMT_BATCH and DW_ASSESSMENT_ADMINISTRATION with the ASMT_STATS_VIEW
information and information from the RISD_TAKS_TEST to monitor the DW process.
2. RISD_TAKS_TEST data is staged to multiple files. The STG_TAKS_LDAA table is used in the
LDAA update process. The remaining staging tables are specific to test areas (subjects).
RISD_TAKS_TEST and several ODS standards tables (A) are used to update D_OBJECTIVE and
D_OBJECTIVE_ITEM.
3. To simplify the update process, the staging tables in (2) are further separated into staging tables
for objectives and objective items (STG_TAKS_OBJECTIVE and STG_TAKS_ITEM).
4. The staging tables in (3) are used to update the actual student objective and item fact result tables
(F_TAKS_STUDENT_OBJ_RESULTS and F_TAKS_STUDENT_ITEM_RESULTS).
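As an illustration of step 4 only, a simplified insert from the objective staging table into the objective fact
table might look like the following; the column list is assumed, and the real mappings are defined in the
OWB maps.

-- Hypothetical column names; the batch ID ties the rows back to DW_ASMT_BATCH.
INSERT INTO f_taks_student_obj_results
       (batch_id, student_key, objective_key, items_correct, items_possible)
SELECT :batch_id, s.student_key, s.objective_key, s.items_correct, s.items_possible
  FROM stg_taks_objective s;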
The following diagram shows the SDAA load. Note that there is no test area for social studies or sci-
ence in SDAA.
(Diagram: SDAA load – ASMT_STATS_VIEW and DW_ASMT_BATCH /
DW_ASSESSMENT_ADMINISTRATION control the run; RISD_SDAA_TEST (ODS) is staged to
STG_SDAA_HDR, STG_SDAA_READ, STG_SDAA_ITEM, STG_SDAA_ELA, STG_SDAA_MATH,
STG_SDAA_WRITE, and STG_SDAA_OBJECTIVE, which update F_SDAA_STUDENT_TA_RESULTS
and F_SDAA_STUDENT_ITEM_RESULTS.)
(Diagram: TEKS load – RISD_TEKS_TEST (ODS) is staged to STG_TEKS_TEST and STG_TEKS_ITEM,
which update F_TEKS_STUDENT_TA_RESULTS and F_TEKS_STUDENT_ITEM_RESULTS;
DW_ASMT_BATCH and DW_ASSESSMENT_ADMINISTRATION record the batch.)
The following diagram shows the remainder of the assessment updates that do not use staging tables
since they do not have objectives and objective items associated with them (TELPAS, SAT, PSAT, AP,
ERWA, Tejas Lee, ACT). The example shows the SAT update. Please note that some processes in-
clude a post process to create summary files and/or database views to simplify reporting.
(Diagram: SAT update example – ASMT_STATS_VIEW (ODS) drives the load of RISD_SAT_TEST,
together with RISD_ASMT_TEST and RISD_ASMT_PASSING_STANDARD (A), into
F_SAT_STUDENT_TA_RESULTS; DW_ASMT_BATCH and DW_ASSESSMENT_ADMINISTRATION
record the batch.)
(Diagram: LDAA update example – ASMT_STATS_VIEW (ODS) drives the load of RISD_LDAA_TEST
and STG_TAKS_LDAA, together with RISD_ASMT_TEST and RISD_ASMT_PASSING_STANDARD (A);
DW_ASMT_BATCH and DW_ASSESSMENT_ADMINISTRATION record the batch.)
(Diagram: PEIMS loads – RISD_PEIMS_GRADUATES → F_PEIMS_LEAVERS;
RISD_PEIMS_FALL_SUBMISSION → F_AYP_BASE.)
The following DW tables are updated manually and are not expected to change.
1. D_ACCESS_LEVEL
2. D_CAMPUS_TYPE
3. D_EDUCATION_GRADE
(Diagram: standards and lookup loads – the ODS tables RISD_ASSESSMENT_TARGETS,
RISD_OP_PARMS, RISD_LOOKUP_CODES, and RISD_ASMT_CALENDAR feed the DW tables
D_ASSESSMENT, D_CAMPUS, D_ASMT_PASSING_STANDARDS, and D_ASMT_CALENDAR.)
Many processes will have post processes to create fields to make reporting easier or to improve per-
formance. The following are some examples.
ACT
SAT
5. OPERATIONS
This section defines all the operations that may be carried out for the application.
State Assessments
1. TAKS
2. SAT
3. PSAT
4. ACT
5. AP
6. TELPAS
7. SDAA II
RPTE was loaded into the ODS but not into the DW since TELPAS will replace RPTE and TELPAS
has an RPTE section. SDAA was converted into SDAA II format for years prior to 2004 but many new
SDAA II fields were defaulted. Only 2004 SDAA II data will be loaded at conversion.
Note:
The above process was automated to load the data directly from a CD provided by the state in a true
production environment. A central on-line ODS process loads the data and automatically runs the
match process to append the RISD student Id. The on-line process also allows the assessment team to
run the follow-up process to select the correct student ID for closely matched records. The final func-
tionality in the on-line process allows the assessment team to set the READY_TO_LOAD indicator to
“Y”.
1. The file is in .csv format, extracted from EdSoft. The last column is usually unpredictably long; you
have to adjust the substr function in the loader script to make sure that every row can be loaded.
There are three loader scripts for the three grades.
2. Save the file in the flat file directory c:\datastore\flat_files\erwa\2004\ 2004 ERWA KN
MOY.xls
3. Change the loader script: change the infile name to the .csv file name and source_date to the in-
coming file name.
4. Run the loader script at the MS-DOS prompt: cd datastore\loader\erwa(tejaslee)\ risd_erwa_kn.ctl
5. Check the log file to make sure every record is loaded: c:\datastore\risd_erwa_kn.log
TEKS: K2Math was converted to TEKS format and merged into the TEKS ODS table. Future
K2Math data will be included in the TEKS file.
risd_act.ctl
risd_ap.ctl
risd_erwa1.ctl
risd_erwa2.ctl
risd_erwa_kn.ctl
risd_k2_math.ctl
A PL/SQL procedure is scheduled to run every night in the ERP concurrent manager. The ODS employee
update program needs to be automated to update the employee file in the ODS.
Supplementary Files
RISD_ASMT_TEST – Manually loaded from Assessment team spreadsheet into the ODS.
RISD_ASMT_OBJECTIVE_KEYS – Manually loaded from Assessment team spreadsheet into the
ODS for TAKS, TEKS, and SDAA II.
RISD_ASMT_ITEM_KEYS – Manually loaded from Assessment team spreadsheet into the ODS for
TAKS, TEKS, and SDAA II.
RISD_LOOKUP_CODES – Descriptions for codes consolidated from multiple sources into a single ta-
ble.
RISD_ASSESSMENT_TARGETS – Manually created table with targets for ACT, SAT, and AP.
RISD_SCHOOL_CALENDAR – Created from the PEIMS 3 submission and used to obtain total school
days by campus to determine student mobility.
RISD_ASMT_CALENDAR – manually loaded from a spreadsheet provided by the assessment team.
Note: Occasionally, a student on an assessment will be identified after the process is loaded. Since the
student will be loaded into the D_STUDENT table under a dummy ID, the student ID can be updated
with the actual student ID at any time. Since surrogate keys are used, all existing relationships would
continue to exist and the student would be reported properly. A simple form could be developed by
RISD to address this special situation. Of course, a manual update is an option if this is a rare occur-
rence.
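For illustration only, such a correction could be as simple as the following update, assuming the natural
district student ID column on D_STUDENT is named DISTRICT_STUDENT_ID (the surrogate key, and
therefore all fact relationships, is untouched):

UPDATE d_student
   SET district_student_id = :actual_student_id
 WHERE district_student_id = :dummy_student_id;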
The following first letters in the student ID can identify “Dummy” student Ids.
State Assessments – State assessments are provided on CD. Changes to the format will cause
changes to the ODS and DW processes. Please note that the manual conversion process was auto-
mated.
TAKS
SDAA
TELPAS
AP
ACT
SAT
PSAT
Local Assessments From EDSOFT – Changes to the format of input records will cause changes to
the ODS and DW processes. Conversion process needs to be automated. Note – the two files below
will have raw data loaded into the DW from conversion files but no benchmark results have been
loaded since these assessments have been discontinued.
ERWA / TPRI
Tejas Lee
LDAA – A spreadsheet manually created by the assessment team and loaded into the ODS manually.
This is a once-a-year process, so a manual load may not be worth mechanizing, but great care must
be taken to ensure the quality of the data before loading. Please note that additional LDAA data is
provided by the TAKS and SDAA II assessment feeds and is loaded separately in the DW staging
process into the LDAA assessment area. The SDAA II and TAKS LDAA information is needed to
create the AYP summary tables used to create the AYP reports.
Local Student information – SIS data obtained through a daily interface with RIMS. This process will
be updated in the ODS to obtain student information from the new SIS environment in the 2006 – 2007
school year.
PEIMS Leaver information (RISD_PEIMS_GRADUATES) – The PEIMS leaver file is primarily used to
identify graduates but all students who leave the district with a reason code are available. This file is
updated manually but the update should be automated.
Supplementary Tables – Many supplementary processes are needed to update standard and lookup
information needed by the DW process. These ODS tables are needed by the update processes
documented in an earlier section but do not have stand-alone update processes.
RISD_ASMT_TEST – Manually loaded from Assessment team spreadsheet into the ODS.
RISD_ASMT_OBJECTIVE_KEYS – Manually loaded from Assessment team spreadsheet into the
ODS for TAKS, TEKS, and SDAA II.
RISD_ASMT_ITEM_KEYS – Manually loaded from Assessment team spreadsheet into the ODS for
TAKS, TEKS, and SDAA II.
RISD_LOOKUP_CODES – Descriptions for codes consolidated from multiple sources into a single ta-
ble.
RISD_ASSESSMENT_TARGETS – Manually created table with targets for ACT, SAT, and AP.
RISD_SCHOOL_CALENDAR – Manually loaded from a spreadsheet provided by the assessment
team.
RISD_ASMT_CALENDAR – manually loaded from a spreadsheet provided by the assessment team.
5.3.2 MONITORING
All DW update jobs are scheduled nightly. The ODS log file will be checked to determine what, if any,
assessment updates are needed. Student updates from RIMS and employee updates from HR occur
nightly in the ODS. The DW update process can be dependent on the ODS process or simply sched-
uled after the ODS process. The DW updates are dependent on new files being available in the ODS.
Student updates are reviewed on a daily basis to keep the ODS and DW in sync.
The data warehouse will have a process that can be automated to run nightly to process any new as-
sessment files available in the ODS. The assessment team will also have the ability to initiate the proc-
ess immediately.
The assessment team assigns student Ids to assessment records and when the assessment team is
satisfied with the results, the assessment is set to READY_TO_LOAD and the assessment is loaded
into the DW and set to final. All users will have access to the data and the reports.
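The document assumes cron and Oracle Workflow for scheduling; purely as an alternative sketch, an
Oracle 10g DBMS_SCHEDULER job for the nightly run (with an on-demand start for the assessment
team) could look like this, where DW_ETL.RUN_NIGHTLY is a hypothetical master procedure:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'DW_NIGHTLY_UPDATE',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'DW_ETL.RUN_NIGHTLY',     -- hypothetical master procedure
    repeat_interval => 'FREQ=DAILY;BYHOUR=2',    -- run nightly at 2:00 a.m.
    enabled         => TRUE);
END;
/

-- The assessment team can start the same job immediately:
EXEC DBMS_SCHEDULER.RUN_JOB('DW_NIGHTLY_UPDATE');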
1. Assessment team member begins the on-line process to load the Assessment information when
the assessment CD is available (or the local assessment data is available).
2. Data is loaded into the ODS from the CD (or from a flat file for local assessments) and the auto-
mated match process runs to append the RISD student id.
3. The assessment team reviews the results from the match process and runs the RISD student ID
correction process. Each student id that cannot be fully matched is presented on-line to the as-
sessment team member with multiple potential matches. The assessment team member selects
one of the potential matches, inputs an override student id, or leaves the student Id as unknown
(null values are expected by the DW process for unknown students).
4. After the student Id correction process is complete, the assessment team member can mark the
ODS file as READY_TO_LOAD.
5. Each night, a DW update process will check the ODS log file to see if there are any update proc-
esses to run. The assessment team member also has the ability to run the DW process immedi-
ately. When the assessment process runs, a DW batch ID is assigned to the assessment and ad-
ministration.
6. DW control will not permit the updating of a file marked as final in its control file. If someone
changes the update indicator manually in the ODS file, the DW process will not apply the update
since the internal DW process has the update marked as final.
5.6.2 PROCEDURES
General Procedures
1. Creating a Request. Written confirmation of the business need. In the case of a production
problem, an email to the application support person will suffice.
2. Approving the Request. Fixes to production problems only need approval from the applica-
tion manager, as the change usually needs to be done immediately. This approval can be post-
implementation in the event of an emergency.
3. Analysis of the Request. The application support person or the developer must determine the
The last step is to update all system documentation, including this application Operations guide (most
changes will be documented in the design tools but new documents may be required.)
“Hot” (or cold) backups are performed nightly and full “cold backups” are sent off site periodically. This document
describes the strategy for backing up this data and recovering it in the event any data is destroyed or corrupted.
The following fully automated database backup processing is performed for the production database:
A full Oracle hot (online) or cold (offline) database backup is performed nightly. This backup is per-
formed under the control of the Oracle Recovery Manager (RMAN) and scheduled via the standard
UNIX scheduler (cron). As data is copied from the database, it is passed to a Legato backup server
running Legato backup software where it is written to tape. Backup tapes are retained for at least 60
days.
If data is physically destroyed (via disk media failure), RMAN is invoked to recover it. First, the dam-
aged database files are restored from the most recent full online database backup. The data is then
recovered to the state it was in at the time of the last backup.
The options in the above list are not necessarily listed in order of most desirable to least desirable.
Note that none of these options guarantee full database recovery in all cases but because of the low
volumes, using option 4 and recovering to point in time and rerunning update processes has the lowest
risk.
Executing an application program or process (option 1) is usually the preferred method for recovering
corrupted data because it reduces risk by executing an established procedure for changing records in
database tables. For example, executing a portion of the ODS extract, transform, and load (ETL) proc-
ess may be the best way to recover corrupted data. In particular, data warehouse processes that per-
form full table refreshes of ODS data should normally be used to recover corrupted data. However,
when the ETL process uses an incremental change strategy, a data corruption problem requires a
special run of the DW update process to ensure all corrupt entries are deleted and replaced during a
rerun for the period in question.
Option 2: SQL insert, update, and delete statements may be used to repair data. Query flashback may
be helpful since it can select table data as it existed at a previous point in time. This method requires a
complete understanding of the rows affected, the correct column values, and inter-table relationships.
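For example, a flashback query can retrieve rows as they existed before the corruption; the table, column,
and interval below are illustrative only.

-- View the student dimension as of one hour ago.
SELECT *
  FROM d_student AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' HOUR)
 WHERE student_key = :student_key;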
Recovery of individual corrupted tables (option 3) can be accomplished via individual table point-in-
time recovery. This method requires:
• Identifying when the data was corrupted
• Recovering the production database on the test data warehouse server to a specified point in
time before the corruption occurred
• Exporting all the rows in the recovered corrupted tables and potentially all related tables on the
test server
Recovery of the entire database to a point in time (option 4) is also possible. This method does not re-
quire use of the test database server as in option 3, and the point-in-time recovery step can be accom-
plished with only a few simple RMAN commands. However, it resets all of the tables in the database to
the same point in time, and all processes run in the interim would have to be reset and rerun. However,
since the RISD database is relatively small and the data needed to catch up several weeks or even
months is available in the ODS, this might be the best alternative in many cases.
See the backup and recovery documentation maintained by the Infrastructure group.
Disaster Recovery
Every enterprise should have a disaster recovery plan to protect its core business in case of a catas-
trophe. The disaster recovery plan for the RISD data warehouse must be considered as a component
of the district’s disaster recovery plan.
Note: Please review OWB documentation for details as many OWB processes contain pre and post
ETL update processing.
3 – Try to add a role for an employee that does not exist. → Adds the new “employee” as a dummy and
sets the role.
4 – Try to add a role for an employee already having the role. → Process rejects the record. (This is part
of the ODS update.)
15 – Validate that all new or changed summary tables and materialized views match the base details. →
Summaries are added for security and/or to improve response time.
16 – The ETL process is set up to run nightly but can be started manually. A process is needed to run
the process in real time.
Info Set Title: Texas Assessments of Knowledge and Skills (TAKS) for State Accountability
On each TAKS test, the critical knowledge and skills are measured by a series of test objectives. These objectives are
not found verbatim in the TEKS curriculum but the objectives are umbrella statements that serve as headings under
which student expectations from the TEKS can be meaningfully grouped. Objectives are broad statements that break up
knowledge and skills to be tested into meaningful subsets around which a test can be organized into reporting units.
These reporting units help RISD employees, parents, and the general public better understand the performance of
students, teachers, and schools as the school year progresses. Test objectives are not intended to be rewordings of the
TEKS but the objectives are designed to be identical across grade levels rather than grade specific. Generally, the test
objectives are the same for third grade through eighth grade (an elementary/middle school system) and for ninth grade
through eleventh grade (a high school system).
Table of Contents
Info Set Title: Texas Assessments of Knowledge and Skills (TAKS) for State Accountability
Info Set Description
1.0 Aggregate Report Name: TAKS Objectives Summary Report – Report 1A
  Report Mockup / Layout: TAKS District Objectives Summary by Student Groups Report – Elementary Example
  Report Measures & Dimensions: TAKS District Objectives Summary Report – Elementary Example
  Dimension Hierarchies for: TAKS District Objectives Summary Report – Elementary Example
2.0 TAKS Passed, MAO and Commended Detail Report – Report 1B
  Report Mockup / Layout: TAKS Passed, MAO and Commended by Detailed Dimensions Report – Elementary Example
  Report Measures & Dimensions: TAKS Passed, MAO and Commended Detail Report – Elementary Example
  Dimension Hierarchies for: TAKS Passed, MAO and Commended Detail Report – Elementary Example
3.0 Aggregate Report Name: TAKS Aggregate Item Analysis Report (Report 2) – Elementary Example
  Report Measures & Dimensions: TAKS Aggregate Item Analysis Report – Elementary Example
  Dimension Hierarchies for: TAKS Item Analysis Report – Elementary Example
4.0 Detail Report Name: TAKS Objective Summary by Student – Report 3
  Report Mockup / Layout: TAKS Performance Level by Student (State Accountability) – Elementary Example
  Dimension Hierarchies for: TAKS Objective Summary by Student (State Accountability)
Report Mockup / Layout: TAKS District Objectives Summary by Student Groups Report - Elementary Example
Page Dimensions: School Year and Administration, Test Area, Location, Test Version, Grade, LEP,
Ethnicity, Economic Disadvantage, Special Ed

Objective | Total | # Mastered | % Mastered | # Not Mastered | % Not Mastered
Obj 1 | 215 | 99 | 71% | 116 | 29%
Obj 2 | 215 | 98 | 68% | 117 | 32%
Obj 3 | 215 | 97 | 71% | 118 | 29%
Obj 4 | 215 | 97 | 72% | 118 | 28%
Obj 5 | 215 | 154 | 73% | 72 | 27%
Obj 6 | 215 | 101 | 67% | 114 | 33%
Obj 8 | 215 | 140 | 78% | 75 | 22%

(Chart: bar chart of the number of students Mastered vs. Not Mastered for Obj 1 through Obj 8.)
Note: Clicking on the highlighted areas (Total, # Mastered, or # Not Mastered) will link to the detail objectives report by student (4.0)
Report Measures & Dimensions: TAKS District Objectives Summary Report – Elementary Example
Linkage(s):
You may link to report 4.0 by clicking on Total, # Mastered, and # Not Mastered.
Dimension Hierarchies for: Report Name: TAKS District Objectives Summary Report – Elementary Example
Dimension – Hierarchies shown in the dimension – Specific values, special sort order, notes, etc.
D_SCHOOL_YEAR and D_ADMIN_PERIOD (Time) – <School Year> and Administration. Note – Cumulative
reports are possible for areas that allow retests, although most grade test areas only have one administration.
D_TEST_AREA – All (default) – MUST SELECT TEST AREA; “All” means all reports. Test Areas vary by
Administration and include: Reading, ELA/Reading, Mathematics, Writing, Science, Social Studies.
D_CAMPUS (Location) – All (includes RISD and non-RISD campuses), District (includes unknown RISD
campuses), Area, Campus.
D_TEST_VERSION (TAKS Version) – Spanish or English.
D_EDUCATION_GRADE (Tested Grade Group) – All, Campus Type, Grade – Grades limited to 3-6 for
Elementary Schools; 3-11 for all (for example, Elementary, Jr. & Senior High).
D_LEP – All; Y/N.
D_ETHNICITY – All; Native American, Asian, African American, Hispanic, White, Other.
D_ECONOMIC_DISADVANTAGE – All; Y/N.
D_SPECIAL_ED – All; In Special Ed / Not in Special Ed.
D_GENDER – All; Male/Female.
Report Mockup / Layout: TAKS Passed, MAO and Commended by Detailed Dimensions Report - Elementary Example
Page Dimensions: School Year, Administration, Location, Test Version, Grade, LEP, Ethnicity, Economic
Disadvantage, Special Ed

Total | # Meeting Standard | % Meeting Standard | # Not Meeting Standard | % Not Meeting Standard | # MAO | % MAO | # CMND | % CMND
Note: There are 4 test areas at each level. Elementary – Reading, Math, Writing, and Science. JH – Reading, Math, Writing, and Social Studies.
HS – ELA, Math, Science, and Social Studies.
Report Measures & Dimensions: TAKS Passed, MAO and Commended Detail Report - Elementary Example
Notes:
Met standards plus not met standards counts
add up to the total students taking the test.
Dimension Hierarchies for: TAKS Passed, MAO and Commended Detail Report - Elementary Example
3.0 Aggregate Report Name: TAKS Aggregate Item Analysis Report (Report 2) – Elementary Example
Page Dimensions: School Year and Administration, Test Area, Location, Test Version, Grade, LEP,
Ethnicity, Economic Disadvantage, Special Ed
Report Measures & Dimensions: TAKS Aggregate Item Analysis Report – Elementary Example
Linkage(s):
Specify the linkages between this report and others.
Notes:
All Measures other than Objective and TEKS can
aggregate
Dimension Hierarchies for: Report Name: TAKS Item Analysis Report – Elementary Example
Dimension – Hierarchies shown in the dimension – Specific values, special sort order, notes, etc.
D_SCHOOL_YEAR and D_ADMIN_PERIOD (Time) – <School Year> and Administration. Note – Cumulative
reports are possible for areas that allow retests, although most grade test areas only have one administration.
D_TEST_AREA – All (default). Test Areas vary by Administration and include: Reading, ELA/Reading,
Mathematics, Writing, Science, Social Studies.
D_CAMPUS (Location) – All (includes RISD and non-RISD campuses), District (includes unknown RISD
campuses), Area, Campus.
D_TEST_VERSION (TAKS Version) – Spanish or English or All.
D_EDUCATION_GRADE (Tested Grade Group) – All, Campus Type, Grade – Grades limited to 3-6 for
Elementary Schools; 3-11 for all (for example, Elementary, Jr. & Senior High).
D_LEP – All; Y/N.
D_ETHNICITY – All; Native American, Asian, African American, Hispanic, White, Other.
D_ECONOMIC_DISADVANTAGE – All; Y/N.
D_SPECIAL_ED – All; In Special Ed / Not in Special Ed.
D_GENDER – All; Male/Female.
Report Mockup / Layout: TAKS Performance Level by Student (State Accountability) – Elementary Example
Page Dimensions: School Year and Administration, Test Area, Location, Test Version, Grade, LEP,
Ethnicity, Economic Disadvantage, Special Ed, Teacher (enhancement), Course (enhancement),
Period (enhancement)
Parameters are set from Linked Report. Can set a % correct range to filter the students and call this report directly from portal if desired.
Report Measures & Dimensions: TAKS Objective Summary by Student (State Accountability) – Elementary Example
Order | Campus Code | Test Level | StudID | Student Name | Raw Score | Score Obj. 1 | Score Obj. 2 | Score Obj. 3 | Score Obj. 4 | Score Obj. 5 | % Correct
1 BSE 3 ID 10001 Student 1 36 15 7 6 8 15 100.0%
2 NPR 3 ID 10002 Student 2 36 15 7 6 8 15 100.0%
3 STR 3 ID 10003 Student 3 36 15 7 6 8 15 100.0%
4 BSE 3 ID 10004 Student 4 29 12 7 6 4 12 80.6%
5 NPR 3 ID 10005 Student 5 31 15 6 6 4 15 86.1%
6 STR 3 ID 10006 Student 6 24 10 4 3 7 10 66.7%
7 BSE 3 ID 10007 Student 7 29 11 7 5 6 11 80.6%
8 NPR 3 ID 10008 Student 8 26 11 5 4 6 11 72.2%
9 STR 3 ID 10009 Student 9 25 12 4 4 5 12 69.4%
10 BSE 3 ID 10010 Student 10 21 10 2 5 4 10 58.3%
11 NPR 3 ID 10011 Student 11 22 8 5 3 6 8 61.1%
12 STR 3 ID 10012 Student 12 21 7 4 5 5 7 58.3%
13 STR 3 ID 10013 Student 13 24 10 5 3 6 10 66.7%
Dimension Hierarchies for: Report Name: TAKS Objective Summary by Student (State Accountability)
Dimension – Hierarchies shown in the dimension – Specific values, special sort order, notes, etc.
D_SCHOOL_YEAR and D_ADMIN_PERIOD (Time) – <School Year> and Administration. Note – Cumulative
reports are possible for areas that allow retests, although most grade test areas only have one administration.
D_TEST_AREA – All (default). Test Areas vary by Administration and include: Reading, ELA/Reading,
Mathematics, Writing, Science, Social Studies.
D_CAMPUS (Location) – All (includes RISD and non-RISD campuses), District (includes unknown RISD
campuses), Area, Campus.
D_TEST_VERSION (TAKS Version) – Spanish or English.
D_EDUCATION_GRADE (Tested Grade Group) – All, Campus Type, Grade – Grades limited to 3-6 for
Elementary Schools; 3-11 for all (for example, Elementary, Jr. & Senior High).
D_LEP – All; Y/N.
D_ETHNICITY – All; Native American, Asian, African American, Hispanic, White, Other.
D_ECONOMIC_DISADVANTAGE – All; Y/N.
D_SPECIAL_ED – All; In Special Ed / Not in Special Ed.
D_GENDER – All; Male/Female.
01 x x x The TAKS assessment information may be used for both State Assessment reports and for Local reports. The
State Assessment reports must match state reports published by Texas Education Agency (TEA). The Local
reports integrate the TAKS assessment information with the RISD SIS information to provide more
accurate/current information. The data warehouse will need to capture the student demographics at assessment
time from both the data file returned by the State and the RISD SIS. All SIS data changes (except corrections)
will be captured to enable point in time local reporting.
02 x x x State Accountability – Location & District
Note the accountability indicator will be at the student level. Each student will have district accountability and
a campus accountability indicator.
If the Test Area has multiple administrations (Reading Grade 3 & 5, Math Grade 5), the student records in
the multiple administrations must have matching values for all administrations (use the rules above).
03 x x x Local Reporting - Location & District
Report a student at the location where the student was enrolled when the TAKS test was administered – the Tested
Location. Recaptured students will be reported from “Other District Location”.
04 x x x State Accountability / Local Reporting
All State Accountability Reports must match exactly what is done by the TEA. This means all student
demographics are as returned by the State and that all calculations and data must come from a specific set of
State files (records or documents, not including exit retests past June).
Local Reporting may use students excluded from State Accountability reports because they transferred into or
out of the District, local (SIS) demographics, local data (SIS Enrollments), and exit retests past June.
For SLC the objectives are to provide RISD leadership with a "preview" of what the State/Federal (NCLB)
Accountability reports will be and to provide more detailed diagnostic information for proactive management.
This is implemented by using the State Accountability sources, demographics, and calculations with one
primary addition, "# Eligible for TAKS", that comes from SIS enrollment counts on the day of testing. The "#
Eligible for TAKS" is used as the denominator in many of the percentages: % absent, % LEP Exempt, etc. For
State Accountability, the "# Eligible" is the number of unique state test documents submitted. Use the
demographics reported on the specific Test Area record provided by the state to ensure the results match the
State reports. Note: A student could have different demographics on different Test Areas, for example, Male on
ELA and Female on Math. If the Test Area has multiple administrations (Reading Grade 3 & 5, Math Grade 5), use
the values for the last administration since that will be the one the student passes, or it will be the last one counted
towards the report if they fail. Only failing students have multiple records.
05 x x x State Accountability - Multiple Scan Sheets – for example, Duplicates
Basic Rule for Writing, Science, and Math and Reading with only a single administration: if the student has
more than one record for the Test Area, Tested Grade, and year, add a sequence number to make it unique and
load it. Must include all records. When loading records, create an exception report showing duplicates.
For Test Areas and Tested Grades which have multiple administrations (Reading Grade 3 & 5, Math Grade 5):
Load all records with Score Code = S.
For the first administration, report only Score Code = S records from that administration.
For the second administration, report Score Code = S records from the second administration, and Score Code =
S records from the first administration if there isn't a Score Code = S record in the second administration for the
student.
06 x Mastering Objectives – The minimum or standard for mastering an objective on TAKS is 70% of the items
correct within an objective. This standard has been set by RISD, not TEA. The 2004 TAKS Mastery levels
are defined below (source: SLC data dictionary). The Minimum for Mastery & Total Possible Points vary by
Test Area, Tested Grade, and Year. Note – This information must be entered into the Assessment benchmark
table. The information below is an example.
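As an illustration of the 70% rule only, the mastery counts behind a report such as 1A could be computed
as follows; the column names are assumptions, and in practice the minimum for mastery comes from the
assessment benchmark table rather than a fixed 0.7 factor.

SELECT objective_key,
       COUNT(*) AS total_students,
       SUM(CASE WHEN items_correct >= 0.7 * items_possible THEN 1 ELSE 0 END) AS num_mastered,
       ROUND(100 * SUM(CASE WHEN items_correct >= 0.7 * items_possible THEN 1 ELSE 0 END)
             / COUNT(*), 1) AS pct_mastered
  FROM f_taks_student_obj_results
 GROUP BY objective_key;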
07 x It was noted that raw score = the number of items correct, except for writing where some responses are weighted.
08 x There is a Locally Developed Alternative Assessment (LDAA) section on a TAKS record. Students with
LDAA values indicate the students are being given a LDAA assessment and will not have TAKS information
for that subject area.
09 There is an English and Spanish version of TAKS reports. Students may take the English version in one test
area and the Spanish version in another test area.
10 It was noted that for tested grades 3, 5, and in the future 8, students may take a test area more than once since it
is a criterion for passing, but other testing grades only take the test annually.
11 x Test Area / Tested Grade combinations have up to 10 Objectives, but we will allow up to 15. Not all Test Area /
Tested Grade combinations have 10 Objectives. The Objective will be set to Null if the Test Area / Tested
Grade combination doesn't include the Objective.
12 x x The State Accountability State Test Reporting Administrations are summarized below for information purposes
only. The ODS production loading of data will control how data is loaded and made available to the Data
Warehouse. Times are approximate.
March - TAKS Tests are administered in February and are returned in March, to allow time to retest students
for promotion to the next grade and before the final accountability results are determined. Includes
TAKS Reading – Grade 3 & 5.
April - TAKS Tests are administered in early April and are returned in late April, to allow time to retest
students for promotion to the next grade before the final accountability results are determined. Includes:
TAKS Math Grade 5.
May – This includes tests administered in several different months but all returned in early May. Includes:
TAKS Reading – Grade 3 & 5, which is a combination of the tests administered in February and the April
retests.
SDAA Reading – Grade 3 & 5 administered in April
TAKS Reading – Grade 4 & 6-8 administered in April
SDAA Reading – Grade 4 & 6-8 administered in April
TAKS Reading – Grade 9 administered in February
SDAA Reading – Grade 9 administered in February
TAKS ELA – Grade 10 & 11 administered in February
SDAA ELA – Grade 10 administered in February
TAKS Writing – Grade 4 & 7 administered in February
SDAA Writing – Grade 4 & 7 administered in February
RPTE/TELPAS – Grades 3-12 administered in March/April
TAKS Math – Grades 3, 4 & 6-11 administered in April
SDAA Math – Grades 3, 4 & 6-11 administered in April
TAKS Science – Grades 5, 10 &11 administered in April
SDAA Science – Grades 5, 10 &11 administered in April
TAKS Social Studies – Grades 8, 10 &11 administered in April
SDAA Social Studies – Grades 8, 10 &11 administered in April
June - This includes all the results that will be used by the state to determine accountability ratings. This is a
combination of the tests administered in several different months and reported in March, April, May and the
following retests:
TAKS Math Grade 5 administered in May
– Grade 3 & 5 final test (does not count towards accountability, only promotion), Grade 5 Math.
(Administration reporting periods: October <School Year>, Feb-March <School Year>, April <School Year>,
May <School Year>, July <School Year>.)
July – TAKS Retests are administered in July and returned in August for 3 & 5 Graders as a last test after
summer school to allow the students to be promoted to the next grade level.
TAKS Reading – Grade 3 & 5
TAKS Math – Grade 5
July – TAKS Exit tests (special case of retest) are administered in August and returned in August for 11th
Graders to have a chance after summer school to pass the exit test for graduation.
TAKS ELA – Grade 11
TAKS Math – Grade 11
Recaptured - In ~ Sept/Oct of the next year the District gets TAKS results for new students to the District who
took TAKS in another District. Recaptured students are not included in the District’s State Accountability
results or reports. The recaptured data is used for local reporting. They are associated with the location where
the student is enrolled in the district. They are primarily used for measuring student progress over time.
Next Oct – TAKS Exit tests (special case of retest) are administered in October and returned in October for 12th
Graders who have not passed, as another chance to pass the exit test for graduation after finishing High School.
TAKS ELA – Grade 11
TAKS Math – Grade 11
TAKS Science – Grades 11
TAKS Social Studies- Grade 11
Next Jan – TAKS Exit tests (special case of retest) are administered in January and returned in January for 12th
Graders who have not passed, as another chance to pass the exit test for graduation.
TAKS ELA – Grade 11
TAKS Math – Grade 11
TAKS Science – Grades 11
TAKS Social Studies- Grade 11
14 X x x It takes some time to completely resolve all data issues for each test administration. The preliminary results
need to be published to a limited subset of users until the data is finalized. Once the data is finalized, the
reports can be published for regular use.
15 x x Retests
Students must pass (Meet Standards) portions of the TAKS test for their Tested Grade to move on at various
points in the academic career. For these TAKS tests the students are allowed to retest. They are defined as
follows:
Passing Grade 3 Reading is required to promote to the fourth grade.
Passing Grade 5 Reading and Math is required to promote to the sixth grade.
Passing all Grade 11 Test Areas (ELA, Math, Science, and Social Studies) is required to graduate from High
School. The Grade 11 tests are called the Exit Level.
Exit tests begin in 11th grade, and students who fail continue to be tested through 12th grade for graduation. If
a student does not pass all Test Areas of TAKS before leaving school, he/she is considered a dropout. The
“dropout” can continue to take the TAKS test as many times as necessary until they pass. Students who have
left school (13th grade) are no longer counted in the RISD totals since they have left school.
16 X X X For each Test Area, Tested Grade and Administration the following are sample code evaluations for 2004
TAKS:
A = Absent
X = Student is ARD exempt, do not score (exit level)
L = Student is LEP exempt, do not score (Grades 3 – 10)
P = Previously Met Standard (Grades 3 and 5 and exit level retest administrations)
O = Other (for example, illness, cheating, and so on)
Y = Student did not take the English-version reading test, do not score (Grades 4 and 6 April and Grade 5 June)
Z = Student did not take the Spanish-version reading test, do not score (Grades 4 and 6 April and Grade 5 June)
Q = Student did not take the TAKS reading test, do not score (Grades 3 and 5 February and Grades 4, 6, 7, and 8 April)
S = Score
C = Student did not take the paper-version reading test and an online-version reading test for this student could not be matched to the
student’s paper-version record (Grade 8 and June exit level retest)
W = Parental Waiver: Parent or guardian requested that a student not participate in the third TAKS reading test opportunity (Grades 3
and 5 June administration)
R = ARD Committee has determined after the April test administration that TAKS reading is not appropriate for the student (Grades 3
and 5 June administration)
T = A state-approved alternate assessment was administered instead of TAKS reading (Grades 3 and 5 June administration)
D = No document processed for this subject (Grades 3, 4, 5, 7, 8, 9, 10, and exit level)
Open Issues
This section can be used to track issues during the requirements gathering phase. Perhaps further clarification is required from a Test Area
Subject Matter Expert (SME) and a meeting needs to be called, etc.
Number | Point of Contact | Expected Resolution Date | Issue Description | Resolution
1 – How is Met Standard defined? Based on Objectives, Scores, or something else? Does it just vary by Test
Area (Reading/ELA/Mathematics) or does it vary by Objective and Tested Grade as well? Is it just a flag on the
student's record by Test Area? → Met Standards is based on a scaled score threshold which varies by Test Area
and Year. The State determines this and sets the Met Standards Flag for the Test Area on the student's record.
5 – What is the authoritative source for locations? Do locations ever change? Add new ones, drop old ones, or
switch configuration, for example, change from a Jr. High to an Elementary School? → Schools may close and
new schools may open. If a school changes type (it can't physically be relocated) we will assume the old one
will close and the "new" one will have a new name.
6 – How does Economic Status match up with Economically Disadvantaged? Are they the same? Economically
disadvantaged students qualify for free or reduced meal services based upon their family's income. →
Economically Disadvantaged is one of the two values in the Economic Status dimension.
7 – Define the users who may view the preliminary reports. → Only the Assessment team, and they have a
separate security group.
8 – What are the requirements to track students who have left the district (Grade 13) (graduated or obtained a
certificate of completion, …)? → None at this time.
9 – Determine if we can link the Teacher dimension to the Location Dimension. The idea is that the available
teachers would only show teachers at the location, not all teachers in the district. → Teacher can be linked to
multiple campuses (the lowest level in the location hierarchy). Part of the security model.