Solaris Faq MNC
Posted by Vivaan
1. What does the pkgadd command do?
7. What is VTS?
17. What is the difference between NFS version 2 and NFS version 3?
You need to boot from your install CD. Insert the Solaris Software CD in your CDROM drive. If your CDROM
drive/BIOS isn't bootable, first insert the "Device Configuration Assistant" (DCA) diskette. At the "Boot
Solaris" menu, choose "CD."
Once you're at the UNIX root prompt #, you can mount the boot drive with "mount /dev/dsk/c0t0d0s0 /mnt"
and inspect the boot drive for anything wrong (omit the "t0" for ATAPI).
This may happen when installing a boot manager that comes with another operating system (such as LILO
from Linux) or an after-market multi-OS boot manager. These sometimes trample the active partition, which in
our case is Solaris. Also, moving the Solaris partition with a partition manager program such as Partition
Magic requires reinstalling the Solaris boot block. Before taking these steps, first verify the Solaris partition is
active. If it isn't, just make the Solaris partition active and reboot. Otherwise follow the steps below.
1. Boot from CD-ROM and get the root prompt, #, as described in the previous question, 7.1.
2. Determine the controller, disk number, and partition. The boot disk is /dev/rdsk/c?t?d?p?, where the ? characters
are the controller #, target ID, disk #, and partition #. Omit "t?" for ATAPI, e.g., /dev/rdsk/c0d0p0.
3. Verify it's the correct device with prtvtoc. This is VERY important; if it's wrong, you
may hose another partition: prtvtoc /dev/rdsk/c0t0d0p0 (omit "t0" for ATAPI; always use p0, which
means the "entire drive"). prtvtoc prints the map for the Solaris partition on the hard drive, if found.
The partitions shown in the output are actually "slices" within the Solaris partition.
5. Finally, remove your CDROM and diskette media and type "/sbin/shutdown -i6" to reboot. The Solaris
Multiple Device Boot Menu should appear after rebooting. If not, you can always do an upgrade (re-)install.
Note: This procedure does NOT make your Solaris partition active again (sometimes needed after installing
another operating system, such as Windows, on the same disk); it just writes to the boot block IN your
Solaris partition. To learn more about the Solaris boot process, read the boot(1M) man page.
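The device naming rule from steps 2 and 3 can be sketched as a small shell helper. This is only an illustration: the function name boot_disk and its argument order are assumptions made for the example, and the prtvtoc commands are printed rather than run.

```shell
# Build the raw whole-disk device path described in steps 2 and 3.
# Usage: boot_disk CONTROLLER DISK [TARGET]
# Leave TARGET off for ATAPI disks, which have no "t?" component.
# (boot_disk is a hypothetical helper, not a Solaris command.)
boot_disk() {
    ctrl=$1
    disk=$2
    target=${3:-}
    if [ -n "$target" ]; then
        echo "/dev/rdsk/c${ctrl}t${target}d${disk}p0"
    else
        echo "/dev/rdsk/c${ctrl}d${disk}p0"
    fi
}

# Print (do not run) the verification command for a SCSI and an ATAPI disk.
echo "prtvtoc $(boot_disk 0 0 0)"
echo "prtvtoc $(boot_disk 0 0)"
```

The p0 suffix is fixed because, as noted above, p0 always means the entire drive.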
3)How do I log on as root if the password doesn't work anymore?
Regaining control of a Solaris x86 system where the root password has been lost can be accomplished by
the following steps. Note that any savvy user can do this with the proper CD-ROM and diskette. Therefore,
of course, physical security of a system is important for machines containing sensitive data.
1. Insert installation boot diskette and installation CD-ROM for Solaris x86.
2. Boot system from the installation floppy and select the CD-ROM as the boot device.
3. Type "b -s" (instead of typing 1 or 2 from the menu) and it'll drop you straight to a root shell, #, (and you'll
be in single-user mode).
4. At the root prompt, #, key in the following commands, which will create a directory called hdrive under
the /tmp directory and then mount the root hard drive partition under this temporary directory.
5. mkdir /tmp/hdrive
7. To use the vi editor, the TERM variable must be defined. Key in the following commands.
8. TERM=at386
9. export TERM
10. Start vi (or some other editor) and load /tmp/hdrive/etc/shadow file:
11. vi /tmp/hdrive/etc/shadow
12. Change the first line of the shadow file that has the root entry to:
13. root::6445::::::
14. Write and quit the vi editor with the "!" override command:
15. :wq!
16. Remove the floppy installation diskette, and reboot the system:
18. When system has rebooted from the hard drive, you can now log in from the Console Login: as root with
no password. Just hit enter for the password.
19. After logging in as root, use the passwd command to change the root password and secure the system.
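If no usable editor is available from the install media, the same edit to the shadow file can be sketched with awk. This is an alternative to the vi steps above, not part of the original procedure; the function name blank_root_password is made up for the example, and on older Solaris releases nawk may be needed in place of awk.

```shell
# Blank the password field (field 2) of the root entry in shadow-format
# text read from standard input; every other line passes through unchanged.
# (blank_root_password is a hypothetical helper name.)
blank_root_password() {
    awk -F: 'BEGIN { OFS = ":" } $1 == "root" { $2 = "" } { print }'
}

# Demonstration on sample data only (the password string is made up).
printf '%s\n' 'root:XyZ123abc:6445::::::' 'daemon:NP:6445::::::' | blank_root_password
```

On a real system the function would read /tmp/hdrive/etc/shadow and write to a temporary file, which should be checked by eye before it replaces the original.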
5)Why is Solaris always booting into the Device Configuration Assistant (DCA)?
• You didn't remove your DCA boot diskette, or you left your installation CD-ROM in a bootable CD-ROM
drive.
• File /boot/solaris/bootenv.rc is corrupt or truncated, usually after a hard reboot or reset. This file is set up
and used by DCA. It should contain several lines.
To change or set your default boot device, see Sun FAQ 2271-02 at
http://access1.Sun.COM/cgi-bin/rinfo2html?227102.faq for instructions. To summarize:
• On the "Boot Tasks" screen, press Enter to place an "X" in front of "View/Edit Autoboot Settings."
• In the "View/Edit Autoboot Settings" screen, note that the Default Boot Device will not be set to any valid
device. Place an "X" in front of Set Default Boot Device and press F2 (Continue).
• On the Set Default Boot Device screen, place an X in front of the correct disk and press F2 (Continue).
• Arrow up to Accept Settings and press Enter to mark it with an "X". Press F2 (Continue) to return to the
Boot Tasks screen.
• Press F3 (Back). The appropriate drivers will load, after which you will be at the Boot Solaris screen. Press F2
(Continue) to continue booting.
7)I get this error message: "can't get local host's domain name" or "The local host's domain name hasn't
been set." What do I do?
This is a NIS message. The easiest way to fix it is to type the following as root:
domainname abc.com; domainname >/etc/defaultdomain
8)My system doesn't boot due to superblock problems with the root filesystem. What do I do?
Normally, you reboot in single user mode and run /usr/sbin/fsck as root and everything is OK. If you get a
message about errors/problems on /dev/dsk/c0d0s0, are told to run fsck manually in single user mode, and
get this message:
BAD SUPER BLOCK: BAD VALUES IN SUPERBLOCK USE AN ALTERNATIVE SUPERBLOCK to
SUPPLY NEEDED INFORMATION e.g. fsck -F ufs -b=# [special].
then you may be able to recover from this if the disk isn't entirely corrupted. The superblock stores important
information about the file system. Because it is so important it is duplicated in several places. Hopefully one
of the backup superblocks isn't corrupted. To see the locations of the backup superblocks, use newfs -Nv. For
example, if your root slice is at /dev/dsk/c0d0s0, run this command:
# newfs -Nv /dev/dsk/c0d0s0
You must specify -Nv so you don't clobber your root slice with a new
filesystem. Your output should look like this:
# newfs -Nv /dev/dsk/c0d0s0
mkfs -F ufs -o N /dev/rdsk/c0d0s0 614880 63 16 8192 1024 16 10 60 2048 t 0 -1 8 7n
/dev/rdsk/c0d0s0: 614880 sectors in 610 cylinders of 16 tracks, 63 sectors
300.2MB in 39 cyl groups (16 c/g, 7.88MB/g, 3776 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 16224, 32416, 48608, 64800, 80992, 97184, 113376, 129568, 145760,
468576, 484768, 500960, 516128, 532320, 548512, 564704, 580896, 597088,
613280,
Note the numbers following "super-block backups." Pick one of them (e.g., 32) and pass it to fsck with
the -F ufs -o b= option, followed by the raw device of the damaged slice:
# fsck -F ufs -o b=32 /dev/rdsk/c0d0s0
You may get a message FREE BLK COUNT(S) WRONG IN SUPERBLOCK SALVAGE? or FILE SYSTEM
STATE IN SUPERBLOCK IS WRONG; FIX? In either case, type "yes" and press return. You should get a
FILE SYSTEM WAS MODIFIED message. Reboot your system. If the system complains about shutdown not
being found, do a halt -q. Now, hopefully, your system will boot up without any problems.
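Pulling the backup numbers out of the newfs -Nv listing by hand is error-prone, so the extraction can be sketched in shell. The function names backup_superblocks and fsck_candidates are invented for this example, and the fsck commands are only printed so they can be reviewed before any of them is run.

```shell
# Read `newfs -Nv` output on stdin and print each backup superblock
# number on its own line. (Hypothetical helper name.)
backup_superblocks() {
    awk '
        /super-block backups/ { in_list = 1; next }
        in_list {
            gsub(/,/, " ")                      # commas separate the numbers
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) print $i
        }
    '
}

# Print one candidate fsck command per backup superblock.
# $1 is the raw device of the damaged slice; nothing is executed.
fsck_candidates() {
    dev=$1
    backup_superblocks | while read -r sb; do
        echo "fsck -F ufs -o b=$sb $dev"
    done
}
```

A typical use would be to pipe newfs -Nv /dev/dsk/c0d0s0 into fsck_candidates /dev/rdsk/c0d0s0 and try the printed commands one at a time until a backup superblock is accepted.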
9)Changing a hostname
Changing a Sun system's hostname means updating every place where the name is recorded, including:
• /etc/hosts.allow (to correct access permissions)
• /etc/dfs/dfstab on this system's NFS servers (to allow proper mount access)
• /etc/vfstab on this system's NFS clients (so they will point at the correct server)
• kerberos configurations
• ethers and hosts NIS maps
• DNS information
• Netgroup information
• cron jobs should be reviewed.
• Other hostname-specific scripts and configuration files.
Additional steps may be required in order to correct issues involving other systems.
Having said all that, the minimum set of files that must be changed is:
• /etc/nodename
• /etc/hosts
• /etc/hostname.*
• /etc/net/*/hosts
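Before and after the change, it can help to list which of these files still mention the old name. The sketch below wraps grep -l for that purpose; the function name files_mentioning is an assumption for the example, and the match is a plain substring (regular expression) search, so a short hostname may also match longer names that contain it.

```shell
# List which of the given files still mention the old hostname.
# Usage: files_mentioning OLDNAME FILE...
# (files_mentioning is a hypothetical helper name.)
files_mentioning() {
    old=$1
    shift
    # -l prints only the names of matching files; errors for missing
    # files are discarded.
    grep -l "$old" "$@" 2>/dev/null
}

# Example call (commented out; "oldhost" is a placeholder name):
# files_mentioning oldhost /etc/nodename /etc/hosts /etc/hostname.* /etc/net/*/hosts
```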
10)NFS Troubleshooting
Sun's web pages contain substantial information about NFS services; search for an NFS Administration
Guide or NFS Server Performance and Tuning Guide for the version of Solaris you are running. The
share_nfs man page contains specific information about export options.
If NFS is not working at all, try the following:
• Make sure that the NFS server daemons are running. In particular, check for statd, lockd, nfsd and rarpd. If
the daemons are not running, they can be started by running /etc/init.d/nfs.server start. See Daemons below
for information on NFS-related daemons.
• Check the /etc/dfs/dfstab and type shareall.
• Use share or showmount -e to see which filesystems are currently exported, and to whom. showmount -a
shows who the server believes is actually mounting which filesystems.
• Make sure that your name service is translating the server and client hostnames correctly on both ends.
Check the server logs to see if there are messages regarding failed or rejected mount attempts; check to
make sure that the hostnames are correct in these messages.
• Make sure that the /etc/net/*/hosts files on both ends report the correct hostnames. Reboot if these have to
be edited.
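The first check in the list above (are the daemons running?) can be sketched as a filter over a process listing. The function name missing_daemons is made up for this example, it assumes one command name or path per input line such as ps -e -o comm produces, and awk -v may require nawk on older Solaris releases.

```shell
# Report which of the named daemons do not appear in a process listing.
# Reads one command name (or full path) per line on stdin.
# Usage: ps -e -o comm | missing_daemons statd lockd nfsd mountd
# (missing_daemons is a hypothetical helper name.)
missing_daemons() {
    awk -v want="$*" '
        # Keep only the last path component, e.g. /usr/lib/nfs/nfsd -> nfsd.
        { n = split($1, parts, "/"); seen[parts[n]] = 1 }
        END {
            m = split(want, names, " ")
            for (i = 1; i <= m; i++)
                if (!(names[i] in seen)) print names[i]
        }
    '
}
```

An empty result means every named daemon was found; any name printed is a candidate for restarting through /etc/init.d/nfs.server start.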
If you are dealing with a performance issue, check:
• Network Issues
• CPU Usage
• Memory Levels
• Disk I/O
• Increase the number of nfsd threads in /etc/init.d/nfs.server if the problem is that requests are waiting for a
turn. Note that this does increase memory usage by the kernel, so make sure that there is enough RAM in
the server to handle the additional load.
• Where possible, mount filesystem with the ro option to prevent additional, unnecessary attribute traffic.
• If attribute caching does not make sense (for example, with a mail spool), mount the filesystem with the
noac option. If nfsstat reports a high getattr level, actimeo may need to be increased (if the attributes do not
change too often).
• nfsstat reports on most NFS-related statistics. The nfsstat page includes information on tuning suggestions
for different types of problems that can be revealed with nfsstat.
If these steps do not resolve the issue, structural changes may be required:
• cachefs can be used to push some of the load from the NFS server onto the NFS clients. To be useful,
cfsadmin should be used to increase maxfilesize for the cache to a value high enough to allow for the
caching of commonly-used files. (The default value is 3 MB.)
11)NFS Client
When a client makes a request to the NFS server, a file handle is returned. The file handle is a 32 byte
structure which is interpreted by the NFS server. Commonly, the file handle includes a file system ID, inode
number and the generation number of the inode. (The latter can be used to return a "stale file handle" error
message if the inode has been freed and re-used between client file accesses.)
If a response is not received for a request, it is resent, but with an incremented xid (transaction ID). This
can happen because of congestion on the network or the server, and can be observed with a snoop session
between server and client.
The server handles retransmissions differently depending on whether the requests are idempotent (can be
executed several times without ill effect) or nonidempotent (cannot be executed several times). Examples of
these would include things like reads and getattrs versus writes, creates and removes. The system
maintains a cache of nonidempotent requests so that appropriate replies can be returned.
Daemons
The following daemons play a critical role in NFS service:
• biod: On the client end, handles asynchronous I/O for blocks of NFS files.
• nfsd: Listens and responds to client NFS requests.
• mountd: Handles mount requests.
• lockd: Network lock manager.
• statd: Network status manager.
iostat
As with most of the monitoring commands, the first line of iostat reflects a summary of statistics since boot
time. To look at meaningful real-time data, run iostat with a time step (e.g., iostat 30) and look at the lines that
report summaries over the time step intervals.
For Solaris 2.6 and higher, use iostat -xPnce 30 to get information including the common device names of
the disk partitions, CPU statistics, error statistics, and extended disk statistics.
For Solaris 2.5.1 and earlier, or for more compact output, use iostat -xc 30 to get the extended disk and CPU
statistics.
In either case, the information reported is:
• disk: Disk device name.
• r/s, w/s: Average reads/writes per second.
• Kr/s, Kw/s: Average KB read/written per second.
• wait: Time spent by a process while waiting for block (e.g., disk) I/O to complete. (See Notes on Odd
Behavior below.)
• actv: Number of active requests in the hardware queue.
• %w: Occupancy of the wait queue.
• %b: Occupancy of the active queue with the device busy.
• svc_t: Service time (ms). Includes everything: wait time, active queue time, seek, rotation, and transfer time.
• us/sy: User/system CPU time (%).
• wt: Wait for I/O (%).
• id: Idle time (%).
The "wait" time reported by iostat refers to time spent by a process while waiting for block device (such as
disk) I/O to finish. In Solaris 2.6 and earlier, the calculation algorithm sometimes overstates the problem on
multi-processor machines, since it does not take into account that an I/O wait on one CPU does not mean
that I/O is blocked for processes on the other CPUs. Solaris 7 has corrected this problem.
iostat also sometimes reports excessive svc_t (service time) readings for disks that are very inactive. This is
due to the action of fsflush keeping the data in memory and on the disk up-to-date. Since many writes are
specified over a very short period of time to random parts of the disk, a queue forms briefly, and the average
service time goes up. svc_t should only be taken seriously on a disk that is showing 5% or more activity.
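The 5% rule can be applied mechanically by filtering iostat -x output so that svc_t is only shown for disks that are busy enough to matter. This is a sketch under stated assumptions: it expects the plain iostat -x column layout with a header row beginning "device" or "disk", the function name busy_disks is invented, and the optional argument simply overrides the 5% threshold.

```shell
# Show svc_t only for disks whose %b is at or above a threshold (default 5).
# Reads `iostat -x` style output on stdin.
# (busy_disks is a hypothetical helper name.)
busy_disks() {
    awk -v min_busy="${1:-5}" '
        # Header row: remember which columns hold svc_t and %b.
        $1 == "device" || $1 == "disk" {
            for (i = 1; i <= NF; i++) {
                if ($i == "svc_t") svc = i
                if ($i == "%b") busy = i
            }
            next
        }
        # Data rows: report only disks at or above the activity threshold.
        svc && busy && NF >= busy && $busy + 0 >= min_busy + 0 {
            print $1, "svc_t=" $svc, "busy=" $busy "%"
        }
    '
}
```

A typical use would be iostat -x 30 | busy_disks, which suppresses the misleading svc_t readings from nearly idle disks.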
15)mpstat
mpstat reports information which is useful in understanding lock contention and CPU loading issues.
mpstat reports the following:
• CPU: Processor ID
• minf: Minor faults
• mjf: Major faults
• xcal: Processor cross-calls (when one CPU wakes up another by interrupting it). If this exceeds
200/second, the application in question may need to be examined.
• intr: Interrupts.
• ithr: Interrupts as threads (except clock).
• csw: Context switches
• icsw: Involuntary context switches (this is probably the more relevant statistic when examining performance
issues.)
• migr: Thread migrations to another processor. If the migr measurement of mpstat is greater than 500,
rechoose_interval should be set longer in the kernel.
• smtx: Number of times a CPU failed to obtain a mutex.
• srw: Number of times a CPU failed to obtain a read/write lock on the first try.
• syscl: Number of system calls.
• usr/sys/wt/idl: User/system/wait/idle CPU percentages.
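The two thresholds mentioned in the list (xcal above 200 and migr above 500) can be checked with a short awk filter over mpstat output. This is a minimal sketch: the function name mpstat_flags is an assumption, and the columns are located from the header row rather than by fixed position.

```shell
# Flag CPUs whose xcal exceeds 200 or whose migr exceeds 500.
# Reads mpstat output (header row starting with "CPU") on stdin.
# (mpstat_flags is a hypothetical helper name.)
mpstat_flags() {
    awk '
        # Header row: map each column name to its position.
        $1 == "CPU" {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("xcal" in col) && ("migr" in col) {
            x = $(col["xcal"]) + 0
            m = $(col["migr"]) + 0
            if (x > 200) print "CPU " $1 ": xcal " x " exceeds 200"
            if (m > 500) print "CPU " $1 ": migr " m " exceeds 500"
        }
    '
}
```

When run as mpstat 30 | mpstat_flags, the first report should be disregarded, since it summarizes activity since boot.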
netstat
netstat provides useful information regarding traffic flow.
In particular, netstat -i lists statistics for each interface, netstat -s provides a full listing of several counters,
and netstat -rs provides routing table statistics. netstat -an reports all open ports.
netstat -k provides a useful summary of several network-related statistics (a listing of several component
kstat statistics) through Solaris 9. The option was removed in Solaris 10 in favor of the /bin/kstat command.
Here are some of the issues that can be revealed with netstat:
• netstat -i: (Collis+Ierrs+Oerrs)/(Ipkts+Opkts) > 2%: This may indicate a network hardware issue.
• netstat -i: (Collis/Opkts) > 10%: The interface is overloaded. Traffic will need to be reduced or redistributed
to other interfaces or servers.
• netstat -i: (Ierrs/Ipkts) > 25%: Packets are probably being dropped by the host, indicating an overloaded
network (and/or server). Retransmissions can be reduced by lowering the rsize and wsize mount
parameters to 2048 on the clients. Note that this is a temporary workaround, since it has the net effect of
reducing maximum NFS throughput on the segment.
• netstat -s: If significant numbers of packets arrive with bad headers, bad data length or bad checksums,
check the network hardware.
• netstat -i: If there are more than 120 collisions/second, the network is overloaded. See the suggestions
above.
• netstat -i: If the sum of input and output packets per second is higher than about 600 for a 10 Mb/s interface or 6000 for
a 100 Mb/s interface, the network segment is too busy. See the suggestions above.
• netstat -r: This form of the command provides the routing table. Make sure that the routes are as you
expect them to be.
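The three ratio checks for netstat -i can be computed rather than estimated by eye. The sketch below assumes the Solaris netstat -i column names (Ipkts, Ierrs, Opkts, Oerrs, Collis) in a header row starting with "Name"; the function name netstat_flags and the wording of the messages are inventions for the example.

```shell
# Apply the 2%, 10% and 25% ratio checks to `netstat -i` output on stdin.
# (netstat_flags is a hypothetical helper name.)
netstat_flags() {
    awk '
        # Header row: map each column name to its position.
        $1 == "Name" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        !("Ipkts" in col) { next }
        {
            ip = $(col["Ipkts"]) + 0; ie = $(col["Ierrs"]) + 0
            op = $(col["Opkts"]) + 0; oe = $(col["Oerrs"]) + 0
            co = $(col["Collis"]) + 0
            if (ip + op > 0 && (co + ie + oe) / (ip + op) > 0.02)
                printf "%s: errors+collisions are %.2f%% of packets (over 2%%)\n", $1, 100 * (co + ie + oe) / (ip + op)
            if (op > 0 && co / op > 0.10)
                printf "%s: collisions are %.2f%% of output packets (over 10%%)\n", $1, 100 * co / op
            if (ip > 0 && ie / ip > 0.25)
                printf "%s: input errors are %.2f%% of input packets (over 25%%)\n", $1, 100 * ie / ip
        }
    '
}
```

Because netstat -i reports totals since boot, the ratios describe long-term behavior; the per-second collision and packet thresholds above still need interval samples.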
nfsstat
nfsstat can be used to examine NFS performance.
nfsstat -s reports server-side statistics. In particular, the following are important:
• calls: Total RPC calls received.
• badcalls: Total number of calls rejected by the RPC layer.
• nullrecv: Number of times an RPC call was not available even though it was believed to have been
received.
• badlen: Number of RPC calls with a length shorter than that allowed for RPC calls.
• xdrcall: Number of RPC calls whose header could not be decoded by XDR (External Data Representation).
16)sar
The word "sar" is used to refer to two related items:
1. The system activity report package
2. The system activity reporter
System Activity Report Package
This facility stores a great deal of performance data about a system. This information is invaluable when
attempting to identify the source of a performance problem.
The Report Package can be enabled by uncommenting the appropriate lines in the sys crontab. The sa1
program stores performance data in the /var/adm/sa directory. sa2 writes reports from this data, and sadc is
a more general version of sa1.
In practice, I do not find that the sa2-produced reports are terribly useful in most cases. Depending on the
issue being examined, it may be sufficient to run sa1 at intervals that can be set in the sys crontab.
Alternatively, sar can be used on the command line to look at performance over different time slices or over
a constricted period of time:
(Here, "5" represents the time slice and "2000" represents the number of samples to be taken. "outfile" is the
output file where the data will be stored.)
The data from this file can be read by using the "-f" option (see below).
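The command line that the parenthetical note describes is not shown in this text. A plausible reconstruction, assuming the standard sar -o file interval count syntax, is sketched below; the function name sar_record_cmd is made up, and the command is printed rather than run so it can be reviewed first.

```shell
# Build the sar command line that records COUNT samples, INTERVAL seconds
# apart, into the binary file OUTFILE. The command is printed, not run.
# (sar_record_cmd is a hypothetical helper; the original command is an
# assumption reconstructed from the description.)
sar_record_cmd() {
    interval=$1
    count=$2
    outfile=$3
    echo "sar -o $outfile $interval $count"
}

sar_record_cmd 5 2000 outfile    # prints: sar -o outfile 5 2000
```

The saved file would then be read back with the -f option, for example sar -f outfile.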
o iget/s: Rate of requests for inodes not in the DNLC. An iget will be issued for each path component of the
file's path.
o namei/s: Rate of file system path searches. (If the directory name is not in the DNLC, iget calls are made.)
o bread/s, bwrit/s: Transfer rates (per second) between system buffers and block devices (such as disks).
o pread/s, pwrit/s: Transfer rates between system buffers and character devices.
o sread/s, swrit/s, fork/s, exec/s: Call rate for these calls (per second).
o avserv: Average service time (ms). (For block devices, this includes seek, rotation, and data transfer times.
Note that the iostat svc_t is equivalent to avwait + avserv.)
• -f filename: Use filename as the source for the binary sar data. The default is to use today's file from
/var/adm/sa.
o %ufs_ipf: Percentage of UFS inodes removed from the free list while still pointing at reusable memory
pages. This is the same as the percentage of igets that force page flushes.
o sml_mem: Amount of virtual memory available for the small pool (bytes). (Small requests are less than
256 bytes.)
o lg_mem: Amount of virtual memory available for the large pool (bytes). (Large requests are 512 bytes to 4 KB.)
o ovsz_alloc: Memory allocated to oversize requests (bytes). Oversize requests are dynamically allocated,
so there is no pool. (Oversize requests are larger than 4 KB.)
o alloc: Amount of memory allocated to a pool (bytes). The total KMA usage is the sum of these columns.
o atch/s: Attaches (per second). (This is the number of page faults that are filled by reclaiming a page
already in memory.)
o ppgin/s: Page-ins (per second). (Multiple pages may be affected by a single request.)
o vflts/s: Address translation page faults (per second). (This happens when a valid page is not in memory. It
is comparable to the vmstat-reported page/mf value.)
o slock/s: Faults caused by software lock requests that require physical I/O (per second).
• -q: Run queue length and percentage of the time that the run queue is occupied.
o freemem: Pages available for use (Use pagesize to determine the size of the pages).
o %wio: Waiting for I/O (does not include time when another process could be scheduled to the CPU).
o proc-sz: Number of process entries (proc structures) currently in use, compared with max_nprocs.
o inod-sz: Number of inodes in memory compared with the number currently allocated in the kernel.
o file-sz: Number of entries in and size of the open file table in the kernel.
o lock-sz: Shared memory record table entries currently used/allocated in the kernel. This size is reported as
0 for standards compliance (space is allocated dynamically for this purpose).
o swpin/s, swpot/s, bswin/s, bswot/s: Number of LWP transfers or 512-byte blocks per second.
o rawch/s, canch/s, outch/s: Input character rate, character rate processed by canonical queue, output
character rate.
18)vmstat
The first line of vmstat represents a summary of information since boot time. To obtain useful real-time
statistics, run vmstat with a time step (e.g., vmstat 30).
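Since the first data row is a since-boot summary, it can be dropped automatically when collecting interval samples. The filter below is a small sketch; the function name skip_boot_summary is an assumption, and it relies on vmstat data rows starting with a number while header rows do not.

```shell
# Pass vmstat output through, dropping the first data row
# (the since-boot summary). Header rows are kept.
# (skip_boot_summary is a hypothetical helper name.)
skip_boot_summary() {
    awk '
        # Data rows start with a number; header rows do not.
        $1 ~ /^[0-9]+$/ && !dropped { dropped = 1; next }
        { print }
    '
}
```

A typical use would be vmstat 30 | skip_boot_summary, so that every row shown reflects a 30-second interval.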
The vmstat output columns are as follows (use the pagesize command to determine the size of the pages):
• procs or kthr/r: Run queue length.
• procs or kthr/b: Processes blocked while waiting for I/O.
• procs or kthr/w: Idle processes which have been swapped.
• memory/swap: Free, unreserved swap space (KB).
• memory/free: Free memory (KB). (Note that this will grow until it reaches lotsfree, at which point the page
scanner is started. See "Paging" for more details.)
• page/re: Pages reclaimed from the free list. (If a page on the free list still contains data needed for a new
request, it can be remapped.)
• page/mf: Minor faults (page in memory, but not mapped). (If the page is still in memory, a minor fault
remaps the page. It is comparable to the vflts value reported by sar -p.)
• page/pi: Paged in from swap (KB/s). (When a page is brought back from the swap device, the process will
stop execution and wait. This may affect performance.)
• page/po: Paged out to swap (KB/s). (The page has been written and freed. This can be the result of activity
by the pageout scanner, a file close, or fsflush.)
• page/fr: Freed or destroyed (KB/s). (This column reports the activity of the page scanner.)
• page/de: Freed after writes (KB/s). (These pages have been freed due to a pageout.)
• page/sr: Scan rate (pages). Note that this number is not reported as a "rate," but as a total number of
pages scanned.
• disk/s#: Disk activity for disk # (I/O's per second).
• faults/in: Interrupts (per second).
• faults/sy: System calls (per second).
• faults/cs: Context switches (per second).
• cpu/us: User CPU time (%).
• cpu/sy: Kernel CPU time (%).
• cpu/id: Idle + I/O wait CPU time (%).
vmstat -i reports on hardware interrupts.
vmstat -s provides a summary of memory statistics, including statistics related to the DNLC, inode and rnode
caches.
vmstat -S reports on swap-related statistics such as:
• si: Swapped in (KB/s).
• so: Swap outs (KB/s).
(Note that the man page for vmstat -s incorrectly describes the swap queue length. In Solaris 2, the swap
queue length is the number of idle swapped-out processes. In SunOS 4, this referred to the number of
active swapped-out processes.)