New features

There are a rather long list of new features that were introduced in 9 and later in IDS 10 and more recently 11 aka Cheetah that users had not been able to take advantage of while constrained to the features offered in IDS 7.3x. The more significant of these include:

Opaque types

Ability to create business specific data types that the engine does not need to interpret, but which are meaningful to applications or to users. For example your warehouse application could define a type bh_locator type as a structure containing several fields including an item's RFID tag id, bin and shelf. IDS does not need to access the contents of the structure, but an application can load the column's contents directly into a binary data structure and access it.

Distinct types

Ability to define new business specific types which are range restrictions on existing types. Distinct types can be based on built-in or user defined types including row types and opaque types. Distinct types are strongly typed so that one must define type type casts to convert to other standard data types. An example might be a dept code type which is defined as a 9 digit character string which only accepts numeric digits and whose length must be exactly 9 characters long.

User defined types (UDTs)

Ability to create business specific data types with built-in rules controlling the range of values and conversions to other types. An example might include cost_usd, curr_gbp, cost_eur, etc. types that include an understanding of their inherent currency codes and conversion functions between each other that reference a table of conversion factors by date to permit comparisons within the engine and automated conversions between customer currencies and USD. Extending the example, an invoice detail line table could use this cost_usd type for storage while the application can request an automatic conversion to type cost_eur for European customers' invoice display and printing.

User defined casts

Related to UDTs are UDCs or explicit and implicit casts for converting values of one type to another.

User defined routines (UDRs)

The ability to write more efficient and/or more capable procedures in the engine using C and/or Java. These 'host language' routines have access to all server facilities and can be far more complex and therefore more efficient than SPL routines. Along with this came the differentiation of STORED PROCEDURES and STORED FUNCTIONS as SPL routines that do not return inline data and those that do.

User defined aggregate functions

Define a set of UDRs that can be used to initialize, iterate, combine, and complete a aggregation calculation on a set of data. The resulting aggregate 'function' can be used like the built-in SUM(), AVG(), MIN(), and MAX() functions in GROUP BY queries, etc.

Virtual Table Interface (VTI)

The ability to define files and databases external to IDS as tables including defining access methods for dearching, reading, inserting, updating, and deleting data from these 'tables'. The example we discussed is a flat file that is pre-processed into a temp table so that its contents can be filtered and processed into multiple result tables. By defining an external table for the flat file IDS can index the 'table' directly and search it or even include its contents in a INSERT INTO … SELECT … FROM …; statement. This can save the overhead of reading the file into a temporary or permanent staging table and deleting the table and/or its contents later.

Virtual Index Interface (VII)

A set of definitions and function hooks that permit one to define new index methods to fit application requirements. The new RTREE built-in index method for geographic/graphical data is built on this interface and the default BTREE index method has been reimplemented using the VII interface to simplify server code.

Datablades and bladelets

Datablades are Informix's method of extending the capabilities of the IDS instance with business logic, new indexing methods, or even whole new data manipulation capabilities. Bladelets are smaller packages containing a small number of UDRs that improve the usability of the IDS instance. There are already many useful datablades that are available including the CopperEye datablade that uses the VII interface to define a new index type which is highly efficient for highly volatile tables, several full text search datablades, the time-series datablade which permits efficient tables of timedependent data, the node datablade which defines hierarchical relationships between rows in a table, and others. Many bladelets are also available like the Dynamic SQL Bladelet which permits SPL routines to define SQL on the fly and execute it, a regular expression bladelet, etc.

Functional indexes

This is a favorite of mine. This extends the usability of IDS indexes in many ways. IDS indexes are otherwise limited to indexing 32 columns and certain data types cannot be included. With a functional index based on a Java language UDR you can pass up to 255 columns as arguments to the UDR and the functions return is the value which becomes the key in the index Btree. The return can be a compressed version of the original columns or some calculation based on them. As long as the function is invariant and deterministic it can be used.

Row types

The ability to create a collection of datatypes which can be used to define a table column or even an entire table. When you have several tables which will have wholly or partially identical lists of columns row types can be used to simplify administration. The classic example is to define an address row type as two address lines plus city, state, postal code, and country columns and use that as a column in any table that needs to store addresses. This guarantees that the address columns are always the same length, etc.

 Additional built-in types
     * BOOLEAN
         Classic true/false
     * SERIAL8
         64bit extended serial type . a table can contain both a SERIAL and SERIAL8 column.
     * Int8
        64bit integer . note this is a proprietary format which is NOT compatible with
        64bit C/C++ integers (though one could use it to define a distinct type with
        conversion functions written in 'C' to make the conversion).

Smart large objects

   BLOB and CLOB types.

   LVARCHAR - Stores variable-length strings of up to 32,739 bytes

Collection data types

SET, LIST, MULTISET . these are new collection data types which can old multiple values of the same type in a single column. Lists are ordered collections permitting duplicate values, sets and multisets are unordered. Sets do not permit duplicate values, multisets do. The collections can be built of any single built-in type or UDT except SERIAL, SERIAL8, and simple large objects.

Sequences

A more general purpose method of stamping records with a unique value. Similar to Oracle's sequences but more efficient. Much faster than implement this feature oneself.

Ability to combine ER and HDR servers to gain the advantages of both

Roles

(Never remember whether 7.31 had these as we didn't use them.) Simplifies configuring new user privileges.

Special user classes

User classes to perform security and dba functions without access to the informix password.

PAM authentication

OP Classes

Allow you to create functions that indexes use to order and search data within the index, including at build time. User defined op classes can be used to support application specific key ordering.

XA Datasource management

IDS 10+ can coordinate XA transactions affecting multiple databases in unlike servers (ie Oracle, DB2, etc) that support XA protocols or that are managed by a transaction manager like Tivoli using UDRs.

DESCRIBE INPUT

A new statement that allows one to describe the input parameters of an UPDATE or DELETE statement that was prepared with replaceable parameters.

External optimizer directives

The ability to apply optimizer directives to the queries an application launches without having to recode/compile/link the application. Good for quick fixes and testing optional directives.

SET DEFERRED PREPARE

This can reduce the overhead of large numbers of prepare statements that are rarely opened. It causes a task to delay actually preparing the statement until the time the statement is executed or a related cursor is OPENed. This reduces the number of back and forth communication steps reducing the overhead in the engine. Also statements that are never executed are never actually prepared. Very useful with applications that are produced by code generator output (ala FourGen).

User Virtual Processors

Permits you to isolate UDRs and datablades to purpose specific VPs so that bugs and performance problems in these user developed features cannot adversely affect unrelated users.

Note that unlike the extension libraries available for some other database systems, any of these features that are not built-in are incorporated into the core of the IDS engine so that their use does not detract for server performance and they can be isolated in 'user' class VPs so that bugs in extensions cannot cause general server crashes. Once such code has been thoroughly debugged it can be deployed in the standard CPU VPs or in multiple user VPs to gain from server parallelization.

Index rebuild using larger page sizes:

Mark Scranton makes a very convincing case that indexes are more efficient on larger pages. The anecdotal evidence to date seems to bear him up. Most sites that have tried the trick have experienced noticeable improvement in indexed query performance. Whether this is because the indexes are flatter and so require fewer IOs to access a particular key leaf or because the index pages end up isolated in their own buffer cache apart from the data pages, thereby reducing contention for buffers, is not clear. I suspect that there is a lot of both causing the effect. I am a fan of isolating buffers by purpose to eliminate intra-application contention.


Personal Tools