B414 Restart
After completing this module, you will be able to:
- List three different ways to restart the Teradata database.
- Use the RESTART command.
- Describe the impact of:
  - Disk(s) failure
  - Disk array controller(s) failure
  - BYNET(s) failure
  - Node failure
  - AWS failure
  - VPROC failure
- Explain the difference between a PDE dump and a UNIX panic dump.
Types of Restarts
Scheduled Restarts
- Changing system parameters (e.g., a DBS Control parameter is updated)
- Software upgrades
- Configuration changes (addition of new AMPs and/or PEs)

Unscheduled Restarts
- Power failure (e.g., the 8/14/2003 outage in the northeastern U.S. and parts of Canada)
- Hardware failure
- Software failure
- Accidents
Restart Processes
1. Spool cylinders are returned to the free cylinder list (unused cylinder pool).
2. Before logons are enabled, uncommitted work is rolled back.

1st: Tables are re-locked for background recovery.
2nd: Logons are enabled in cold start.
Scheduled Restarts
Restart Teradata with     Use this command      Options
Command line              tpareset <comment>
DB Window (supervisor)    restart tpa           [, COLD] [, COLDWAIT] COMMENT

Example:
# tpareset -f Change of system parameters

To see when restarts occurred, with a brief explanation of how and why, for the last week:
# tpatrace 3
# tpatrace
TPA Initialization Trace for Node 001-01
02/16/2004 08:25:33 -------------------- PDE starting
02/16/2004 08:25:35.06 (346) ---- PDE starting.
02/16/2004 08:25:35.07 (346) State is NOTPA/START.
:
PDE States
The pdestate command can be used to check the current state of the PDE and Teradata software for a specific node.

# /usr/ntos/bin/pdestate
PDE: Parallel Database Extension state is TPA.

NULL/START    NULL/STOPPED      NULL/RESET       NULL
NOTPA/START   NOTPA/NETCONFIG   NOTPA/NETREADY   NOTPA/RECONCILE   NOTPA
TPA/START     TPA/VPROCS        TPA/READY        TPA/DONE          TPA
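A monitoring script might want to read the state out of the pdestate output. The following is a minimal sketch, not a Teradata-supplied tool: the helper names parse_pdestate and is_fully_up are hypothetical, the output format is assumed to match the single sample line shown above, and treating plain TPA as the "fully up" state is an assumption based on that sample and on TPA being the last state in the list.

```python
import re

# States listed in this module, in the order shown above.
PDE_STATES = [
    "NULL/START", "NULL/STOPPED", "NULL/RESET", "NULL",
    "NOTPA/START", "NOTPA/NETCONFIG", "NOTPA/NETREADY", "NOTPA/RECONCILE", "NOTPA",
    "TPA/START", "TPA/VPROCS", "TPA/READY", "TPA/DONE", "TPA",
]

# Assumption: the line of interest ends with "state is <STATE>." as in the sample.
_STATE_RE = re.compile(r"state is\s+([A-Z]+(?:/[A-Z]+)?)\.?\s*$")


def parse_pdestate(output: str) -> str:
    """Return the state named in pdestate output (hypothetical helper)."""
    for line in output.splitlines():
        match = _STATE_RE.search(line.strip())
        if match:
            return match.group(1)
    raise ValueError("no PDE state found in output")


def is_fully_up(state: str) -> bool:
    """Assumption: only the plain TPA state means the node is fully up."""
    return state == "TPA"


if __name__ == "__main__":
    sample = "PDE: Parallel Database Extension state is TPA."
    state = parse_pdestate(sample)
    print(state, is_fully_up(state))
```

The sketch only parses text; capturing the output of /usr/ntos/bin/pdestate on a real node is left to the caller.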
Unscheduled Restarts
Disk Drive Failures

Scenario 1
Failure:    One disk in a drive group
Result:     No TPA reset
Resolution: Replace disk
            Array controllers automatically rebuild the disk

Scenario 2
Failure:    Two disks in a drive group
Result:     TPA reset (1-5 minutes)
            AMP taken offline and marked as Fatal
            Fallback tables OK
            Non-fallback tables partially available
Resolution: Replace the two disks
            Reformat LUNs or Volumes in the drive group
            Perform a table rebuild
            Restore non-fallback tables

Scenario 3
Failure:    Two disks in 2 different drive groups associated with AMPs in the same cluster
Result:     2 AMPs fail in a cluster
            Machine halts
Resolution: Restore User DBC and tables

BYNET Failure
Failure:    Both BYNETs fail
Result:     Teradata halts and is not available
Resolution: Repair BYNETs

Vproc Failure
Failure:    AMP or PE vproc fails
Result:     TPA restart (1-5 minutes) and vprocs may be marked offline
Resolution: If necessary, run Scandisk, Checktable, and Rebuild utilities

AWS Failure
Failure:    AWS fails
Result:     No restart of Teradata; AWS is not available to monitor/manage system
Resolution: Reboot or recover AWS
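The scenarios above can be summarized as a small lookup table, which may help when working through the review questions. This is only an illustrative sketch: the dictionary keys and the names Outcome, FAILURE_OUTCOMES, and causes_restart are hypothetical labels, while the outcomes themselves are copied from the scenarios in this module.

```python
from enum import Enum


class Outcome(Enum):
    NO_RESET = "no TPA reset"
    TPA_RESET = "TPA reset (1-5 minutes)"
    HALT = "system halts"


# Keys are hypothetical labels; outcomes come from the scenarios above.
FAILURE_OUTCOMES = {
    "one_disk_in_drive_group": Outcome.NO_RESET,
    "two_disks_in_same_drive_group": Outcome.TPA_RESET,
    "two_disks_in_two_drive_groups_same_cluster": Outcome.HALT,
    "both_bynets": Outcome.HALT,
    "amp_or_pe_vproc": Outcome.TPA_RESET,
    "aws": Outcome.NO_RESET,
}


def causes_restart(failure: str) -> bool:
    """True only when the listed result is a TPA reset (not a halt)."""
    return FAILURE_OUTCOMES[failure] is Outcome.TPA_RESET


if __name__ == "__main__":
    for name, outcome in FAILURE_OUTCOMES.items():
        print(f"{name}: {outcome.value}")
```

A halt is kept distinct from a TPA reset because in the halt scenarios the system stays unavailable until the fault is repaired.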
[Diagram: Collector Task, AMPs, Crashdump Table]
1. Selective memory and swapped pages are written to pdedump space.
2. As part of Teradata restart, a background collector task reads pdedump and writes dump information to a Crashdump table in the Crashdumps database.

If the Crashdumps database is out of perm space, the collector task outputs a warning message and retries every 60 minutes to create a crashdump table.

UNIX MP-RAS commands to determine if dumps are present in pdedump:
# pdedumpcheck -v       (lists /dev/pdedump dumps that are present)
# fdlcsp -mode clear    (clears all dumps from /dev/pdedump)
[Diagram: Crashdumps, Sys_Calendar, SysAdmin, SYSDBA, SystemFE]
Allocate approximately 150-200 MB of permanent space per node per crashdump.

Example: a four-node system with space allocated for three crashdumps:
((150 x 4) x 3) = 1800 MB without fallback
((150 x 4) x 3) x 2 = 3600 MB with fallback

MODIFY USER Crashdumps AS PERM = 1800E6;

Example of a crashdump name:
Crash_20040213_012519_02
      (Date)   (Time) (Segment #)
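The sizing arithmetic and the crashdump naming pattern can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, 150 MB per node is the lower bound from the guideline above, and the date and time fields are assumed to be YYYYMMDD and HHMMSS, inferred from the single example name.

```python
from datetime import datetime


def crashdump_perm_mb(nodes: int, dumps: int,
                      mb_per_node: int = 150, fallback: bool = False) -> int:
    """Perm space in MB: per-node size x nodes x dumps, doubled for fallback."""
    total = mb_per_node * nodes * dumps
    return total * 2 if fallback else total


def parse_crashdump_name(name: str):
    """Split Crash_<date>_<time>_<segment> into (datetime, segment number).

    Assumption: date is YYYYMMDD and time is HHMMSS, as the example suggests.
    """
    prefix, date_part, time_part, segment = name.split("_")
    if prefix != "Crash":
        raise ValueError(f"unexpected crashdump name: {name}")
    taken_at = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    return taken_at, int(segment)


if __name__ == "__main__":
    print(crashdump_perm_mb(4, 3))                 # four nodes, three dumps
    print(crashdump_perm_mb(4, 3, fallback=True))  # same, with fallback
    print(parse_crashdump_name("Crash_20040213_012519_02"))
```

Passing mb_per_node=200 gives the upper end of the guideline when a more conservative allocation is wanted.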
- Archive to file and ftp to support center
- Use DUL and archive to tape
PDE Kernel
Crash utility may be used to interpret dump.
Review Questions
1. What is the operating system command to restart Teradata? __________________
2. What is the DB Window supervisor command to restart Teradata? __________________
3. Which of the following choices will cause a Teradata restart? __________________
   A. AWS hard drive failure
   B. Single drive failure in RAID 1 drive group
   C. Two drive failures in same RAID 1 drive group
   D. Single SMP power supply failure
   E. SMP CPU failure
   F. One of BYNETs fails
   G. LAN connection to SMP is lost

Answers:
1. tpareset
2. restart tpa
3. C, E