How do I ....

What is currently installed

Check out the $INFORMIXDIR/etc/*cr files, this will list the various products and versions. The $INFORMIXDIR/etc/.snfile will also give you the installation order

Tell if the Engine is offline

command: onstat -u 

INFORMIX-OnLine Version 7.14.UD2 -- On-Line -- Up 00:18:03 -- 147744 Kbytes
If the words 'On-Line' do not appear in the returned line then there could be a problem. Check the log directly if the engine isn't running, otherwise use onstat -m

If the engine remains in 'Fast Recovery' for a period and there is no disk activity onstat -D restart the instance. The instance should go Online.

Tell if the logs are full

command: onstat -l or onstat -m

onstat -l
onstat -m

One of the possible causes is continuous logging has stopped or failed.

Command: ps -ef | egrep ontape 

informix  795     1 0 09:32:31 0 0:50 /infx/bin/ontape -c
informix 12899 6863 1 12:59:34 2 0:00 egrep ontape

This indicates that continuous logging is running on one instance but not necessarily on the correct instance. Once the ontape has been running for a reasonable length of time it should be accumulating CPU time. If it isn't then the program might be waiting for user intervention.

If you are running in multi-instance environment then you need to check you are looking at the correct instance

Command: onstat -u
 

This identifies the ‘monitor’ threads running in the instance, onstat -u

INFORMIX-OnLine Version 7.14.UD2 -- On-Line -- Up 14:54:57 -- 195680 Kbytes 

Userthreads 
address    flags sessid     user tty    wait tout locks nreads nwrites
40024010 ---P--D    0   informix  -        0   0     0     189    570
40024444 ---P--F    0   informix  -        0   0     0       0      0 
40024878 ---P--F    0   informix  -        0   0     0       0      0
40024cac ---P--F    0   informix  -        0   0     0       0      0 
4002d760 Y--P--- 2405       eric  - 60d55a98   0     1     178    244 
4002db94 ---P--M 2313      admin  -        0   0     0       0      0  <Monitor>
4002dfc8 Y--P--- 2474       eric  - 4021fd98   0     3       9      3
. 
.
.
400348dc Y--P--- 2466      eric  -  404bb4dc   0     3      23     46
40035144 Y--P--- 2476      eric  -  404a38ac   0     1      18     15

51 active, 128 total, 71 maximum concurrent

If a monitor thread is found, indicated by the M in the flags column, then check to see if it is the correct thread.

Command: onstat -g ses <sessid>
 

onstat -g ses

INFORMIX-OnLine Version 7.14.UD2 -- On-Line -- Up 14:56:54 -- 195680 Kbytes 

session                            RSAM    total     used 
  id     user  tty  pid hostname threads  memory   memory
2313     admin  -  9795  eric        1   4251648  4217944 

tid    name   rstcb    flags  curstk  status 
2532 ontape 4002db94 ---P--M    1660  sleeping(secs: 4)
....

An ontape thread is running within this instance. If the system is not being archived then some form of logging is in place. A netnorm condition means the program is waiting user intervention.

Get a serial value when there isn't one in the table

Author: Carsten Haese

create sequence tmpseq;
  
select tmpseq.nextval, sometable.id
  from sometable
  order by sometable.id;
  
drop sequence tmpseq;

use dostats

While dostats is not a complicated program to use but it does deserve it's own section

if a single user is hogging the system

command: onstat -u

onstat -l. Look for a user that has significantly more read and/or writes than anybody else.

Userthreads
address   flags   sessid user  tty   wait tout locks nreads nwrites
40024010 ---P--D   0 informix  -        0  0     0    568     116 
40024444 ---P--F   0 informix  -        0  0     0      0       0
40024878 ---P--F   0 informix  -        0  0     0      0       0
40024cac ---P--F   0 informix  -        0  0     0      0       0
400250e0 ---P--F   0 informix  -        0  0     0      0       0
40025514 ---P--F   0 informix  -        0  0     0      0       0
40025948 ---P--F   0 informix  -        0  0     0      0       0
40025d7c ---P--F   0 informix  -        0  0     0      0       0
400261b0 ---P--F   0 informix  -        0  0     0      0       0
400265e4 ---P--B   9 informix  -        0  0     0      0       1
40026a18 ---P--D   0 informix  -        0  0     0      0       0
400276b4 Y--P--- 627     eric  - 402597e0  0     3      0       3

40027ae8 Y--P--- 649     eric  2 404ac790  0     1   2028     578

40027280 Y--P--- 668 informix  9 4049dcc4  0     1      0      14
400276b4 Y--P--- 625     eric  - 402597e0  0     3      0       3
40027ae8 Y--P--- 649 informix  2 404ac790  0     1      2       0
16 active, 128 total, 24 maximum concurrent

This can have a number of causes.

  • Run away SQL i.e. there is no longer a controlling process Check
  • Poor SQL. check the SQL for the problem session Check
  • PC switched off without disconnecting from the database Check

Which Informix Programs have no Controlling Process

Command: ps -ef
 
USER       PID   PPID  C   STIME   TTY  TIME    CMD 
root         1      0  2   09 May   -   39:39  /etc/init 
root      1699      1  0   09 May   -    0:00  /etc/uprintfd 
root      2852      1  0   09 May   -   14:49  /etc/syncd 60 
root      3431   3933  0   09 May   -    0:04  /etc/syslogd 
..
.. 
informix 16193  24690  0 12:36:35 pts/9  0:00  dbaccess db1
informix 16757      1  4   01 Jun   -   10:04  oninit -s 
root     16965   5750  0 10:30:00   -    0:00  telnetd 
informix 17275  16757  0   01 Jun   -    1:04  oninit -s 
root     18102   5750  0 17:26:00   -    0:00  telnetd 
informix 18808  16757  0   01 Jun   -    0:22  oninit -s 
informix 19587  13186  0 12:16:22 pts/2  0:00  -ksh 
informix 20092  15739  0 10:57:06 pts/5  0:00  -ksh 
informix 20417  19136  0 17:26:31 pts/3  0:00  -ksh 
informix 20598  16757  0   01 Jun   -    0:25  oninit -s 
informix 20851  25015  5 12:44:05 pts/1  0:00  ps -ef 
informix 21062  16965  0 10:30:00 pts/7  0:00  -ksh 
informix 21623  16757  0   01 Jun   -    0:20  oninit -s 
informix 24462      1  0 12:16:34 pts/2 12:23  dbaccess eric   <Lost Process>
informix 24690  13425  0 11:52:33 pts/9  0:00  -ksh 
informix 25015  18102  0 17:26:00 pts/1  0:00  -ksh

The most likely candidates are programs that have a PID of 1. The caveat are jobs being run by cron, and nohupped jobs from previous sessions. If the ‘possible’ runaway is attached to a terminal then it is more likely to be the trouble maker.

For pre Version 6 engines each individual user process spawns a process (sqlexec or sqlturbo). The same rules apply.

How to identify SQL for a session

  
Command: onstat -g sql <session id>
onstat -sql.
INFORMIX-OnLine Version 7.14.UD2 -- On-Line -- Up 02:53:23 -- 43040 Kbytes
Sess      SQL     Current   Iso   Lock   SQL  ISAM   F.E.
Id     Stmt type  Database  Lvl   Mode   ERR   ERR   Vers 
32303      -          eric   DR  Wait 5   0     0    5.01   

Current SQL statement : 

SELECT col1, col2 FROM eric_tab WHERE col2 = 9 AND col1 IN 
("AE","AI","AL","ASH","AA","AR","AO","DA","DU","DF","OJW","PE",
"PO","PI","PU","PNR","WH")

Copy the current SQL statement into dbaccess and run it with set explain on;

Don't run onstat -g sql on a very busy server it will never return and you will not be able to kill the session it starts. Because of the very dynamic nature of this particular table, and the fact that rows are constantly being dropped off the beginning of it and appended to the end, your query will chase its tail until the server is shutdown.

Tune the Data Dictonary

Reduce checkpoint times

Author: Jonathan Leffler

If the performance nose-dives at checkpoint time, seriously consider reducing LRU_MIN_DIRTY and LRU_MAX_DIRTY. Don't try to go below LRU_MIN_DIRTY 1, LRU_MAX_DIRTY 2. The default values (of 30/20 in 4.00 and 60/50 are, in my not so very humble opinion, rubbish for any system with more than about 1 MB of shared memory. Most databases are read a lot more than they are modified.

Author: Art Kagel

I agree that additional cleaners are of limited use during normal LRU writes but they are needed for fastest checkpointing. Since at checkpoint time each chunk is assigned to another cleaner thread until the threads are exhausted, and since that one thread, as has already been pointed out, is only scheduling the actual I/Os with either the AIO VPs or KAIO VPs, that thread will block on each I/O that it schedules and single thread writes to your disks. You must have multiple cleaners, even with a single CPU VP, since the other cleaners can wake and schedule the I/Os that they are responsible for while the first cleaner is blocked waiting for I/O service…My point is that the single cleaner thread has to wait for the issued I/O to complete. It does relinquish the CPU VP to other threads so that other work is not blocked

So I recommend 1.5 * min(#chunks,#LRUs) as the number of cleaners keep the 2-2.5 * NUMCPUs idea for multiple CPUs in mind and constrained by a maximum useful value of about 64 for multiple CPU systems.

get Random Numbers

Within SQL there is no random number routines.

SPL

Author: Jonathan Leffler

@(#)$Id: random.spl,v 1.2 1997/12/08 19:31:44 johnl Exp $

Simple emulation of SRAND and RAND in SPL
Using random number generator suggested by C standard (ISO 9899:1990)

CREATE PROCEDURE sp_setseed(n INTEGER)
      DEFINE GLOBAL seed DECIMAL(10) DEFAULT 1;

      LET seed = n;

END PROCEDURE;

CREATE PROCEDURE sp_random() RETURNING INTEGER;
     DEFINE GLOBAL seed DECIMAL(10) DEFAULT 1;
     DEFINE d DECIMAL(20,0);

     LET d = (seed * 1103515245) + 12345;
     -- MOD function does not handle 20-digit values
     LET seed = d - 4294967296 * TRUNC(d / 4294967296);
 
    RETURN MOD(TRUNC(seed / 65536), 32768);

END PROCEDURE;

C UDR

The following Random Number generator will generate a C based UDR

use the High Performance Loader

identify outstanding In Place Alters

Who is holding a lock

Command: onstat -k

onstat -k

Locks 
address   wtlist    owner lklist      type  tblsnum rowid key#/bsiz 
30258128    0    400276b4        0   HDR+X  100002    208    0
302583e8    0    40027280        0   HDR+S  100002    20d    0
30258414    0    40028350 30258548   HDR+X  a0004f  4e50f    0
3025851c    0    40028350        0       S  100002    20d    0
30258548    0    40028350 3025851c  HDR+IX  a0004f      0    0  

5 active, 50000 total, 32768 hash buckets

The owner of a lock can be traced to a particular user using the onstat -u command and looking for the owner address detailed in the onstat -k. This gives the username. The following SQL submitted against the sysmaster database will provide summary information.

select dbsname database, username, hostname, 
    tabname, type 
  from syslocks, syssessions 
  where sid = $owner
    and syslock.sid = sysessions.sid 

Which table is locked

Command: onstat -k

onstat -k

Locks 

address   wtlist     owner lklist    type   tblsnum rowid key#/bsiz
30258128       0 400276b4        0   HDR+X   100002   208     0 
302583e8       0 40027280        0   HDR+S   100002   20d     0 
30258414       0 40028350 30258548   HDR+X   a0004f 4e50f     0
3025851c       0 40028350        0       S   100002   20d     0
30258548       0 40028350 3025851c  HDR+IX   a0004f     0     0

5 active, 50000 total, 32768 hash buckets

An individual table tblsnum can be converted into a database tablename using the following SQL.

select tabname from systables where hex(partnum) = $tblsnum; 

See the memory usage

Command: onstat -g seg

onstat -g seg

INFORMIX-OnLine Version 10.10.UD2 -- On-Line -- Up 01:48:44 -- 147744 Kbytes

Segment Summary: 
(resident segments are not locked) 
id         key       addr    size   ovhd class  blk   blk 
                                                used  free
40965 1384007681 30000000 69369856  1860   R    8464     4
40966 1384007682 40000000 81920000  1844   V     723  9277
 4106 1384007683 50000000  1105920   612   M     132     3
 4107 1384007684 60000000  1105920   612   M      20   115
 4108 1384007684 70000000  1105920   612   M       0   135

When Informix requests and is allocated memory it assumes that if the memory was needed once it will be needed again and does not release it back to the operating system. However, the instance can be forced to release memory to the operating system if it is no longer in use.

Command: onmode -F

onstat -F

INFORMIX-OnLine Version 7.14.UD2 -- On-Line -- Up 01:52:38 -- 147744 Kbytes

Segment Summary: 
(resident segments are not locked) 
id         key       addr    size   ovhd class  blk   blk 
                                               used  free
40965 1384007681 30000000 69369856  1860   R    8464     4
40966 1384007682 40000000 81920000  1844   V     723  9277
4106 1384007683 50000000  1105920   612   M     132     3
4107 1384007684 60000000  1105920   612   M      20   115

If a segment can not be released to the operating system then the instance will attempt to release as much memory as possible from within the block.

Large ready queues and the addition of new segments might mean someone is running a large query with a high PDQPRIORITY.

Other considerations include

See if there is a Memory Shortage

sar -g is a good indicator of whether more memory may be needed. Use the ps -elf command to show the number of cycles used by the page daemon. A high number of cycles, combined with high values for pgfree/s and pgscan/s indicates a memory shortage.

Alternatively, use vmstat. Ignore the page in and out columns [pi/po] and instead look at the scan rate [sr] and the anticipated short-term memory shortfall [de] columns. The scan rate should be less than 200. The de column is slightly odd in the fact that it can peak very high, due to the pre-emptive paging algorithms in the OS, but should fall back very quickly. On SVR4 you have serious memory problems if you are swapping

Do not confuse swapping with paging, paging is how the memory management sub-system looks after itself, swapping is a cry for help from an under-resourced machine. The only time swap in required is at the start of a process when the kernel determines whether the swappable image of the process can fit into the available swap space. It doesn't use the swap it just checks to see if it's available.

See if there is a Memory Leak

Any driver that uses Kernel Memory Allocator (KMA) resources, but does not specifically release them before exiting can create a memory leak. A memory leak causes the amount of memory allocated by KMA to increase over time. Thus, if the alloc field increases steadily over time, they may be a memory leak. Another indication of a memory leak is failed requests. If this occurs, then it is likely that a memory leak has caused KMA to be unable to reserve and allocate memory.

Detecting memory leaks on a live system can be very difficult. However, if luxury of a test system is available then it is relatively straightforward to monitor the engine's memory usage and determine whether it is slowly increasing over time.

tell if the SQL still processing data

Unless the machine has a very large amount of memory or the query is very simple the engine will nearly always have to go to disk to get the data/index pages. [Remember later engines can keep some user-definable tables in memory]. Therefore, the onstat -u for the running sessions should show disk reads or writes. Once the data is in memory it is processed from within the buffers and buffer activity for the session needs to be checked. For this, the thread id for the session is required.

Command: onstat -g tpf <thread id>

onstat -g tpf

INFORMIX-OnLine Version 10.00.UD2 -- On-Line -- Up 00:39:56 -- 147744 Kbytes 

Thread profiles
tid lkreqs lkw dl to lgrs isrd iswr isrw isdl isct isrb lx bfr  bfw  lsus lsmx seq
   0    0   0   0  0 582     0   0   0    0     0    0   0  5473 529 14088  0   0
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
   0 6521   0   0  0  49     0   0   0    0    14    0   0   450 233     0 172  0 
   0    0   0   0  0   0     0   0   0    0     0    0   0     0   0     0   0  0 
<Active thread>
1348 2627   0   0  0   3 16259   0   1    0     1    0   0 50925  41     0  16  4 

If the bfr column is not increasing then the thread is probably not doing any work i.e. it's stopped. On most UNIX based systems this number can easily increase at over 5000 per second on multi-table queries. It obviously varies between machines but it is a good rule of thumb. Remember a rapid rate of climb only indicates the machine is working hard and not the SQL is efficient.

Find the thread id for a session

While the session controls the user requests the actual work is done by a sqlexec thread (typically).

Command: onstat -g ses <ses_id>

onstat -g ses

INFORMIX-OnLine Version 7.14.UD2 -- On-Line -- Up 00:03:28 -- 147744 Kbytes 

session #RSAM total used 
id       user tty pid hostname threads memory memory 
1294 informix  -   -1    ERIC      1    57344  33860 

tid    name    rstcb    flags    curstk   status 
1322 sqlexec 400276b4   Y--P---    1292   running

Memory pools count 1 
name class   addr   totalsize freesize #allocfrag #freefrag 
1294   V   403ac010   57344      23484    146        11 

name          free   used  name        free   used 
overhead         0     96  scb            0     80
opentable        0   3428  filetable      0    636 
log              0   4168  temprec        0   3132
ralloc       23328      0  gentcb         0    192
ostcb            0   2260  net            0   7636
sqscb            0   9772  rdahead        0    172
hashfiletab      0    276  osenv          0    396
sqtcb           156  1464  fragman        0    152  

Sess   SQL     Current   Iso    Lock     SQL ISAM   F.E.
Id   Stmt Type Database  Lvl    Mode     ERR  ERR   Vers
1294     -       eric     CR   Not Wait   0     0   5.01 

Last parsed SQL statement : 

SELECT AQ.part8, AQ.part7, AQ.part6, AQ.part5,
    AQ.part4, AQ.part3, 
    AQ.part2, AQ.part1, AP.constrname, AP.tabid, AP.constrtype FROM 
    'informix'.sysconstraints AP, informix.sysindexes AQ WHERE AQ.idxname =
    AP.idxname AND AP.tabid = 183 AND AP.constrtype = 'P' 

Where is the next SERIAL value stored

systabinfo.serialv

Personal Tools