11. Virtual database configuration

11.1. Writing a Virtual Database Configuration File

The C-JDBC controller configuration file must conform to the C-JDBC DTD that can be found in the xml directory of the C-JDBC distribution. The DTD is extensively documented and the most up-to-date information will be found in the xml/c-jdbc-x.y.dtd file. Several configuration file examples are available in the doc/examples directory.

Here is an example of what a minimal C-JDBC configuration file looks like:

<?xml version="1.0" encoding="UTF8"?>

<!DOCTYPE C-JDBC PUBLIC "-//ObjectWeb//DTD C-JDBC 1.0//EN"
                        "http://www.objectweb.org/c-jdbc/dtds/c-jdbc-2.0.2.dtd">

<C-JDBC>
  <VirtualDatabase name="vdbName">
    <AuthenticationManager> ... </AuthenticationManager>

    <DatabaseBackend name="node1" driver="com.myDriver.class"
                     url="jdbc:protocol://host/myDB" connectionTestStatement="select 1">
       ...
    </DatabaseBackend>

    <RequestManager>
      <RequestScheduler>
         ...
      </RequestScheduler>

      <LoadBalancer>
         ...
      </LoadBalancer>
    </RequestManager>
  </VirtualDatabase>
</C-JDBC>
      

The next sections describe the different elements composing an XML configuration file.

11.2. Virtual Database

A virtual database element is defined as follows:

<!ELEMENT VirtualDatabase (Distribution?, Monitoring?, Backup?,AuthenticationManager, DatabaseBackend+, RequestManager)>     
<!ATTLIST VirtualDatabase 
       name               CDATA #REQUIRED
       maxNbOfConnections CDATA #IMPLIED
       poolThreads        (true | false) "true"
       minNbOfThreads     CDATA #IMPLIED
       maxNbOfThreads     CDATA #IMPLIED
       maxThreadIdleTime  CDATA #IMPLIED
       sqlDumpLength      CDATA "40"
>
      

A virtual database is the database exposed to the user. It contains:

  • a set of real database backends,

  • an authentication manager that matches the virtual database and real backends login/password,

  • a request manager that defines the behavior of the controller for this virtual database,

Here is a brief description of each virtual database attribute:

  • name: name of the virtual database to be used in the JDBC URL (jdbc:cjdbc://host/VirtualDatabaseName).

  • maxNbOfConnections: maximum number of concurrent connections accepted for this virtual database. The controller stops accepting client connections when maxNbOfConnections concurrent connections are running. Default is 0 (no limit).

  • poolThreads: if false, one thread is created for each connection and dies when the connection closes. If set to true, threads are created on-demand and kept in a pool to be reused to serve multiple connections. Default is true.

  • minNbOfThreads: minimum number of threads to keep in the pool (if poolThreads is set to true). Default is 0.

  • maxNbOfThreads: maximum number of threads in the pool (if poolThreads is set to true). Default is 0 (no limit).

  • maxThreadIdleTime: maximum time in seconds a thread can remain idle before being removed from the pool. Default is 60 seconds (a thread that has not served any request in the past 60 seconds will be killed).

  • sqlDumpLength: maximum number of characters of a SQL statement to display in traces and exception messages. 0 means no limit and the full statement is inserted in the message (be careful especially if you are using large objects). Default is 40.
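As an illustration, here is a sketch of a virtual database declaration using these attributes (the values are only examples, not recommended settings):

<VirtualDatabase name="myVirtualDB" maxNbOfConnections="200" poolThreads="true"
                 minNbOfThreads="10" maxNbOfThreads="100"
                 maxThreadIdleTime="60" sqlDumpLength="80">
  ...
</VirtualDatabase>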

11.2.1. Distribution

A Distribution element defines the virtual database distribution rules so that the virtual database can be shared by several C-JDBC controllers (feature called horizontal scalability). A Distribution element is defined as follows:

  • groupName: group name to be used by the JGroups communication layer. If no name is provided, the virtual database name is used instead.

  • macroClock: if a request contains a date macro that needs to be replaced, this attribute defines how the current time is determined. If macroClock is set to none, then each controller uses its local timestamp for all of its backends. If macroClock is set to local, then the macros are replaced using the local time before being sent to remote controllers. Note that in all cases, it is preferable to have controllers synchronized using NTP or an equivalent mechanism.

  • castTimeout: the timeout to use when sending a request to the group.

<!ELEMENT Distribution EMPTY>
<!ATTLIST Distribution
    groupName   CDATA #IMPLIED
    macroClock  (none | local) "local"
    castTimeout CDATA "0"
>
      


[Note]Note

The JGroups stack configuration is defined in config/jgroups.xml. Refer to the JGroups documentation if you want to alter the JGroups configuration.

[Note]Note

When a controller fails, all backends attached to it are automatically disabled. A full recovery process is then needed for these nodes. To learn more about this issue, read the horizontal scalability design document.
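As an illustration, a Distribution element for a virtual database shared by two controllers could look like the following sketch (the group name is an illustrative value):

<Distribution groupName="myCluster" macroClock="local" castTimeout="0"/>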

11.2.2. Monitoring

[Note]Note

Careful! Monitoring can lead to a memory leak and should only be used for short periods of time. There is also a JMX method on the VirtualDatabaseMBean to turn monitoring on and off while online:

			void setMonitoringToActive(boolean active) throws VirtualDatabaseException;
		

Monitoring provides a generic section for different monitoring modules. At the moment, only “SQLMonitoring” is provided.

<!ELEMENT Monitoring (SQLMonitoring*)>

<!ELEMENT SQLMonitoring (SQLMonitoringRule*)>
<!ATTLIST SQLMonitoring
    defaultMonitoring (on | off) "on"
>
    

SQL Monitoring provides statistics (count, error, cache hits, timing) for SQL queries. It is possible to define rules to turn monitoring on or off for specific query patterns.

defaultMonitoring: defines whether a request should be monitored (on) or not (off) when no specific rule matches the request.

<!ELEMENT SQLMonitoringRule EMPTY>
<!ATTLIST SQLMonitoringRule
    queryPattern    CDATA #REQUIRED
    caseSensitive   (true | false) "false"
    applyToSkeleton (true | false) "false"
    monitoring      (on | off) "on"
>
    

A SQLMonitoringRule defines a specific monitoring rule for all queries that match the given pattern.

  • queryPattern: a regular expression understood by the Jakarta Regexp API. For more information on Regexp format, go to Jakarta Regexp web site.

  • caseSensitive: true if the pattern matching must be case sensitive.

  • applyToSkeleton: true if the pattern must apply to the query skeleton (as found in a PreparedStatement), false if the instantiated query should be used. Example: skeleton: SELECT * FROM t WHERE x=? ; instantiated query: SELECT * FROM t WHERE x=12.

  • monitoring: on to activate the monitoring for this rule, off to disable it.

Examples:

  • <SQLMonitoringRule queryPattern="^delete" monitoring="off"/> will turn monitoring off for all delete queries.

  • <SQLMonitoringRule queryPattern="select * from users *" monitoring="on"/> will turn monitoring on for all select queries on the users table.

    [Note]Note

    !Warning! This is different from <SQLMonitoringRule queryPattern="select \* from users *" monitoring="on"/> which turns monitoring on only for queries of the form "select * from users ...".
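Putting these elements together, a complete Monitoring section could look like the following sketch (the patterns and default are only examples):

<Monitoring>
  <SQLMonitoring defaultMonitoring="off">
    <SQLMonitoringRule queryPattern="^select" caseSensitive="false" monitoring="on"/>
    <SQLMonitoringRule queryPattern="^update" caseSensitive="false" monitoring="on"/>
  </SQLMonitoring>
</Monitoring>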

11.3. Backup Manager

A Backup Manager defines a number of Backupers in charge of performing backup/restore operations on backends. This element is defined as follows:

<!ELEMENT Backup (Backuper+)>

<!ELEMENT Backuper EMPTY>
<!ATTLIST Backuper
    backuperName CDATA #REQUIRED
    className    CDATA #REQUIRED
    options      CDATA #IMPLIED
>
      

A Backuper is defined by a logical backuperName used by the administration console when performing a backup operation. The className specifies the implementation of the Backuper. Backuper-specific options can be provided as well (this can be the path to a properties file or a set of attributes). Check your Backuper documentation for its specific options.

Here is an example to use the Octopus Backuper for a virtual database:

<Backup>
  <Backuper backuperName="Octopus" className="org.objectweb.cjdbc.controller.backup.OctopusBackuper"/>
</Backup>      
    
[Note]Note

Octopus does not have access to the C-JDBC classloader for drivers and therefore it needs database drivers to be accessible from the controller classpath. A good solution is to unjar the drivers in the drivers/ directory of C-JDBC.

Octopus dumps are by default stored in a compressed .zip format.

11.4. Authentication Manager

An authentication manager element is defined as follows:

<!ELEMENT AuthenticationManager (Admin+, VirtualUsers)>

<!ELEMENT Admin (User+)>

<!ELEMENT User EMPTY>
<!ATTLIST User
    username CDATA #REQUIRED
    password CDATA #REQUIRED
>

<!ELEMENT VirtualUsers (VirtualLogin+)>

<!ELEMENT VirtualLogin (TrustedLogin*)>
<!ATTLIST VirtualLogin 
    vLogin    CDATA #REQUIRED
    vPassword CDATA #REQUIRED
>
<!ELEMENT TrustedLogin EMPTY>
      

An authentication manager defines:

  1. an administrator login to be used by the console to access the virtual database administration menu (see Section 7.5, “Administrator Menu”) that allows enabling the backends,

  2. "virtual logins" that are used by the client application and that are mapped to "real logins" for each backend.

  3. "trusted logins" will be used in the future to allow reusing other form of authentication from within C-JDBC.

Here is an example of an authentication manager definition:

    <AuthenticationManager>
      <Admin>
      	<User username="admin" password="adminPwd"/>
      </Admin> 
      <VirtualUsers>
        <VirtualLogin vLogin="user1" vPassword="userPwd1"/>
        <VirtualLogin vLogin="user2" vPassword=""/>
      </VirtualUsers>
    </AuthenticationManager>
      

In this example, the virtual database has one administrator. The admin can use the login/password "admin/adminPwd" to log into the console.

Two virtual logins are defined: user1 and user2, with the passwords userPwd1 and the empty string, respectively. These logins are those used by the client application and given to the C-JDBC driver.

The connection manager to use with each of these virtual logins has to be defined in the DatabaseBackend section. Each DatabaseBackend has to define a connection manager for each of the virtual users specified here.

11.5. Database Backend

Each database backend must be given a unique name (it is a logical name but it is convenient to use the same name as the real machine name). The database schema is automatically gathered from the backend when it is added to the virtual database. However, you can specify a static database schema (refer to Section 11.5.2, “Database Schema Definition”) to be used instead. Finally, a specific connection manager (see Section 11.5.3, “Connection Manager”) defines the connection pooling strategy for each virtual login on each backend.

A database backend element is defined as follows:

<!ELEMENT DatabaseBackend (DatabaseSchema?, RewritingRule*, ConnectionManager+)>
<!ATTLIST DatabaseBackend 
    name                    CDATA #REQUIRED
    driver                  CDATA #REQUIRED
    driverPath              CDATA #IMPLIED
    url                     CDATA #REQUIRED
    connectionTestStatement CDATA #REQUIRED  
>
      

Here is a brief description of database backend attributes:

  • name: the unique logical name identifying this backend.

  • driver: the database native JDBC driver class name.

  • driverPath: name of the directory or jar file containing the native driver files. If driverPath is omitted, the driver must be in the drivers/ directory. If several driver jar files are in the same directory, the first jar file containing the class name specified in the driver attribute is used. Note that drivers are loaded in separate classloaders which allows you to use different versions of the same driver on different backends just by specifying the right jar file.

  • url: the JDBC URL to connect to this database backend.

  • connectionTestStatement: SQL statement to send on a connection to check if the connection is still valid. This is used when C-JDBC suspects a connection to be broken after the failure of a request. This statement should not update the database because if the connection is still valid the database state should remain the same. Here are the settings for the most popular databases:

    • for MySQL use select 1

    • for PostgreSQL use select now().

    • for Apache Derby use values 1.

    • for HSQL use call now().

    • for SAP DB (MySQL MaxDB) use select count(*) from versions.

    • for Oracle use select * from dual.

    • for Firebird use select * from rdb$types.

    • for InstantDB use set date format "yyyy/mm/dd".

    • for Interbase use select * from rdb$types.

    • for Microsoft SQL Server 2000 use select 1.

Here is a complete example of a database backend element including its connection manager definition:

<DatabaseBackend name="node1" driver="org.gjt.mm.mysql.Driver"
      url="jdbc:mysql://node1.objectweb.org/rubis" connectionTestStatement="select 1">
  <ConnectionManager vLogin="user1" rLogin="ruser1" rPassword="rpass1">
    <SimpleConnectionManager/>
  </ConnectionManager>
  <ConnectionManager vLogin="user2" rLogin="ruser1" rPassword="rpass1">
    <VariablePoolConnectionManager initPoolSize="10" 
                                   minPoolSize="5"
                                   maxPoolSize="100"/>
  </ConnectionManager>
</DatabaseBackend>
      

11.5.1. Rewriting requests on backends

If your cluster is made of database engines from different vendors, client requests might not be understood by all database backends. If your application was written for PostgreSQL and you want to add MySQL backends, some requests might have to be adapted to execute correctly on MySQL. You can specify rules to rewrite queries on the fly on a specific backend. A RewritingRule element defines how a query matching a given pattern should be rewritten.

<!ELEMENT RewritingRule EMPTY>
<!ATTLIST RewritingRule 
    queryPattern  CDATA #REQUIRED
    rewrite       CDATA #REQUIRED
    matchingType  (simple | pattern) "simple"
    caseSensitive (true | false) "false"
    stopOnMatch   (true | false) "false"
>
    
  • queryPattern: SQL query pattern to match.

  • rewrite: rewritten SQL query.

  • matchingType: simple means that the first occurrence of queryPattern in the request will be replaced by the string specified in rewrite; pattern uses a pattern-based match/replace. A pattern uses ?x where x is a logical number assigned to the pattern. Example: select ?1 from ?2 where x=?3.

  • caseSensitive: true if matching must be case sensitive.

  • stopOnMatch: rules are applied in the order they are defined. If a rule matches and stopOnMatch is set to true, the following rules are ignored. If stopOnMatch is set to false and another rule matches the rewritten query, the query is rewritten again.

Examples:

 <RewritingRule queryPattern="from user" rewrite="from "user;"" 
    matchingType="simple"/>

will rewrite the query select * from user where x=y as select * from "user" where x=y.

 <RewritingRule queryPattern="select * from t where x=?1" 
    rewrite="select x from y where y=?1" matchingType="pattern"/>

will rewrite the query select * from t where x=435 to select x from y where y=435.

 <RewritingRule queryPattern="?1 LIMIT ?2,?3" rewrite="?1 LIMIT ?3,?2" 
    matchingType="pattern"/>

will rewrite the query select * from t limit 10,20 to select * from t limit 20,10.
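Rewriting rules are declared inside the DatabaseBackend element they apply to, before the connection managers. Here is a sketch reusing two of the rules above; the backend name, driver and URL are illustrative values:

<DatabaseBackend name="node2" driver="org.gjt.mm.mysql.Driver"
      url="jdbc:mysql://node2.objectweb.org/myDB" connectionTestStatement="select 1">
  <RewritingRule queryPattern="from user" rewrite='from "user"' matchingType="simple"/>
  <RewritingRule queryPattern="?1 LIMIT ?2,?3" rewrite="?1 LIMIT ?3,?2"
      matchingType="pattern" stopOnMatch="true"/>
  <ConnectionManager vLogin="user1" rLogin="ruser1" rPassword="rpass1">
    <SimpleConnectionManager/>
  </ConnectionManager>
</DatabaseBackend>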

11.5.2. Database Schema Definition

DatabaseSchema groups static and dynamic definitions for gathering, constructing and validating the in-memory schema used for load balancing and caching.

A database schema is defined as follows:

<!ELEMENT DatabaseSchema (DatabaseStaticSchema?)>
<!ATTLIST DatabaseSchema
 dynamicPrecision (static|table|column|procedures|all) "all"
 gatherSystemTables (true | false) "false"
 schemaName CDATA #IMPLIED
>
       
  • dynamicPrecision: if set to static, the controller does not check schemas and stored procedures and relies entirely on the statically defined schema. If set to something other than static, the controller gets information from the backend to check the validity of the static schema at the given level: table checks table names only, column also checks column names, procedures gathers all executable stored procedures, and all collects all the information that can be gathered.

  • gatherSystemTables: true if system tables and views should be retrieved, false otherwise (default).

  • schemaName: if no schemaName is specified all objects visible to the user are gathered, otherwise only the objects belonging to the specified schema are used.

[Note]Note

The default option for constructing the schema is to collect all information, even if a static schema is defined, especially for checking the validity of the input. This can be very slow if the database has a large number of stored procedures defined.
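As an illustration, a schema that is only checked at the table level, without gathering system tables, could be declared as follows (the schema name is an illustrative value):

<DatabaseSchema dynamicPrecision="table" gatherSystemTables="false" schemaName="myschema"/>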

A static database schema can be defined to override the schema automatically gathered by the controller. However, the schema must remain compatible with the schema gathered from the backend.

A database schema element is defined as follows:

<!ELEMENT DatabaseStaticSchema (DatabaseProcedure*,DatabaseTable+)>

<!ELEMENT DatabaseProcedure (DatabaseProcedureParameter*)>
<!ATTLIST DatabaseProcedure 
     name       CDATA #REQUIRED
     returnType (resultUnknown | noResult | returnsResult) "resultUnknown"
>

<!ELEMENT DatabaseTable (DatabaseColumn+)>
<!ATTLIST DatabaseTable 
    tableName   CDATA #REQUIRED
    nbOfColumns CDATA #REQUIRED
>

<!ELEMENT DatabaseColumn EMPTY>
<!ATTLIST DatabaseColumn 
    columnName CDATA #REQUIRED
    isUnique   (true | false) "false"
>
      

The isUnique attribute should be set to true if the column has a UNIQUE constraint. This is the case for primary keys (composite primary keys are not yet supported). This only affects the cache behavior and the parsing of select statements.

Here is an example of a database schema definition:

<DatabaseStaticSchema>
   <DatabaseTable tableName="users" nbOfColumns="10">
      <DatabaseColumn columnName="id" isUnique="true"/>
      <DatabaseColumn columnName="firstname" isUnique="false"/>
      <DatabaseColumn columnName="lastname" isUnique="false"/>
      <DatabaseColumn columnName="nickname" isUnique="false"/>
      <DatabaseColumn columnName="password" isUnique="false"/>
      <DatabaseColumn columnName="email" isUnique="false"/>
      <DatabaseColumn columnName="rating" isUnique="false"/>
      <DatabaseColumn columnName="balance" isUnique="false"/>
      <DatabaseColumn columnName="creation_date" isUnique="false"/>
      <DatabaseColumn columnName="region" isUnique="false"/>
   </DatabaseTable>

   <DatabaseTable tableName="regions" nbOfColumns="2">
     <DatabaseColumn columnName="id" isUnique="true"/>
     <DatabaseColumn columnName="name" isUnique="false"/>
   </DatabaseTable>
</DatabaseStaticSchema>
      

11.5.3. Connection Manager

One connection manager must be defined for each virtual login that has access to this backend. The real login and password (rLogin/rPassword) defined for this virtual login on this backend are used to open the connections. An example of a connection manager definition is available in Section 11.5, “Database Backend”.

C-JDBC offers several connection managers that are described hereafter:

  • SimpleConnectionManager: basic connection manager that opens a new connection for each request and closes it at the end. It is useful, for example, if the underlying driver already implements connection pooling.

  • FailFastPoolConnectionManager: offers connection pooling and fails fast when the pool is empty. poolSize is the size of the pool.

    All connections are initialized at startup time and if the pool size is too large it is adjusted to the largest number of connections available. Once the pool is empty, null is returned instead of a connection. Therefore incoming requests will fail until at least one connection is freed. No system overload should occur with this connection manager, but if the pool size is too small, many requests will fail.

  • RandomWaitPoolConnectionManager: provides connection pooling and wait when the pool is empty until a connection is freed. This connection manager accepts the following attributes:

    • poolSize: this is the size of the pool.

    • timeout: this is the maximum time in seconds to wait for a connection to be freed. Default is 0 and means no timeout, that is to say that we wait until one connection is freed.

    All connections are initialized at startup time and if the pool size is too large it is adjusted to the largest number of connections available. Once the pool is empty, the requests wait until a connection is freed or the specified timeout has elapsed. The FIFO[3] order of connection requests is not ensured by this connection manager since it relies on the Java wait/notify mechanism.

  • VariablePoolConnectionManager: provides connection pooling with a dynamically adjustable pool size. This connection manager accepts the following attributes:

    • initPoolSize: initial pool size to be initialized at startup.

    • minPoolSize: minimum number of connections to keep in the pool. Default is equal to initPoolSize.

    • maxPoolSize: maximum number of connections in this pool. Default is 0 and means no limit.

    • idleTimeout: time in seconds a connection can stay idle before being released (removed from the pool). Default is 0 and means that once allocated, connections are never released.

    • waitTimeout: this is the maximum time in seconds to wait for a connection to be freed. Default is 0 and means no timeout, that is to say that we wait until one connection is freed.
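As a sketch, here is how the fail-fast and random-wait pool connection managers could be declared for two virtual logins on the same backend (the logins, pool sizes and timeout are illustrative values):

<ConnectionManager vLogin="user1" rLogin="ruser1" rPassword="rpass1">
  <FailFastPoolConnectionManager poolSize="20"/>
</ConnectionManager>
<ConnectionManager vLogin="user2" rLogin="ruser2" rPassword="rpass2">
  <RandomWaitPoolConnectionManager poolSize="20" timeout="30"/>
</ConnectionManager>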

The connection manager element complete definition is as follows:

<!ELEMENT ConnectionManager (SimpleConnectionManager |
                             FailFastPoolConnectionManager |
                             RandomWaitPoolConnectionManager |
                             VariablePoolConnectionManager)>
<!ATTLIST ConnectionManager 
    vLogin    CDATA #REQUIRED
    rLogin    CDATA #REQUIRED
    rPassword CDATA #REQUIRED
>

<!ELEMENT SimpleConnectionManager EMPTY>

<!ELEMENT FailFastPoolConnectionManager EMPTY>
<!ATTLIST FailFastPoolConnectionManager
    poolSize CDATA #REQUIRED
>

<!ELEMENT RandomWaitPoolConnectionManager EMPTY>
<!ATTLIST RandomWaitPoolConnectionManager
    poolSize CDATA #REQUIRED
    timeout  CDATA #IMPLIED
>

<!ELEMENT VariablePoolConnectionManager EMPTY>
<!ATTLIST VariablePoolConnectionManager
    initPoolSize CDATA #REQUIRED
    minPoolSize  CDATA #IMPLIED
    maxPoolSize  CDATA #IMPLIED
    idleTimeout  CDATA #IMPLIED
    waitTimeout  CDATA #IMPLIED
>
      

11.6. Request Manager

The request manager is composed of a scheduler (see Section 11.6.2, “Request Scheduler”), an optional query cache (see Section 11.6.3, “Request Cache”), a load balancer (see Section 11.6.4, “Load Balancer”) and an optional recovery log (see Section 11.6.5, “Recovery Log”).

If requests need to be parsed, parsing can be done sequentially when needed (backgroundParsing set to false, which is the default value) or forced to be performed in the background by a separate thread (meaning a new thread is created for each request that needs to be parsed).

Parsing is by default case insensitive (caseSensitiveParsing is set to false) which means that table and column names will be matched to the database schema without checking the case. If you want to enforce the parsing to be case sensitive and reject queries that do not use the same case for table and column names as the ones fetched from the database, set caseSensitiveParsing to true.

A timeout in seconds can be defined for begin/commit/rollback operations. If no value is given, the default timeout is 60 seconds. Warning! A value of 0 means no timeout: the controller waits forever until completion.

The request manager element definition is as follows:

<!ELEMENT RequestManager (RequestScheduler, RequestCache?, LoadBalancer, RecoveryLog?)>
<!ATTLIST RequestManager
    backgroundParsing    (true | false) #IMPLIED
    caseSensitiveParsing (true | false) #IMPLIED
    beginTimeout      CDATA #IMPLIED
    commitTimeout     CDATA #IMPLIED
    rollbackTimeout   CDATA #IMPLIED
>
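As an illustration, the overall request manager skeleton could look like the following sketch (the timeout values are only examples):

<RequestManager backgroundParsing="false" caseSensitiveParsing="false"
                beginTimeout="120" commitTimeout="120" rollbackTimeout="120">
  <RequestScheduler> ... </RequestScheduler>
  <RequestCache> ... </RequestCache>
  <LoadBalancer> ... </LoadBalancer>
  <RecoveryLog> ... </RecoveryLog>
</RequestManager>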
      

11.6.1. Macros Handler

C-JDBC can interpret and replace on-the-fly macros with a value computed by the controller (the RequestManager in fact). This prevents different backends from generating different values when interpreting the macros, which could result in data inconsistencies. The supported macros are the following:

  • rand: RAND() can be replaced with an int, long, float or double value.

  • all the date macros (now, currentDate, currentTime, timeOfDay and currentTimestamp) can be replaced by one of the following:

    • off: do not replace the macro.

    • date: java.sql.Date.toString() built from the current time at the controller (example: 2001-02-17).

    • time: java.sql.Time.toString() built from the current time at the controller (example: 19:07:32).

    • timestamp: java.sql.Timestamp.toString() built from the current time at the controller (example: 2001-02-17 19:07:32-05).

  • timeResolution: defines the timer precision to use when rewriting a query that contains a date macro. Default is 0 millisecond which is the highest precision. A value of 1000 corresponds to a 1 second precision, 60000 to a 1 minute precision and so on.

The MacroHandling element definition is as follows:

 
<!ELEMENT MacroHandling EMPTY> 
<!ATTLIST MacroHandling 
        rand (off | int | long | float | double) "float" 
        now  (off | date | time | timestamp) "timestamp" 
        currentDate (off | date | time | timestamp) "date" 
        currentTime (off | date | time | timestamp) "time" 
        timeOfDay (off | date | time | timestamp) "timestamp" 
        currentTimestamp (off | date | time | timestamp) "timestamp" 
        timeResolution CDATA "0" 
> 
         
[Note]Note

A default MacroHandling element is instantiated and used if nothing is specified in the configuration file.
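For example, to replace RAND() with an integer value and round all date macros to a 1 second precision, one could write (values are illustrative):

<MacroHandling rand="int" timeResolution="1000"/>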

11.6.2. Request Scheduler

The request scheduler is responsible for scheduling the requests and ensuring a serializable execution order. Different schedulers are provided for each RAIDb level (see Section 10, “RAIDb Basics”). Optimized schedulers are also provided for use with a single database backend (SingleDB configuration).

The request scheduler element definition is as follows:

<!ELEMENT RequestScheduler (SingleDBScheduler | RAIDb-0Scheduler |
                            RAIDb-1Scheduler | RAIDb-2Scheduler)>

<!ELEMENT SingleDBScheduler EMPTY>
<!ATTLIST SingleDBScheduler
    level (passThrough | pessimisticTransaction) #REQUIRED
>

<!ELEMENT RAIDb-0Scheduler EMPTY>
<!ATTLIST RAIDb-0Scheduler 
    level (passThrough | pessimisticTransaction) #REQUIRED
>

<!ELEMENT RAIDb-1Scheduler EMPTY>
<!ATTLIST RAIDb-1Scheduler
    level (passThrough | query | optimisticQuery | optimisticTransaction |
           pessimisticTransaction) #REQUIRED
>

<!ELEMENT RAIDb-2Scheduler EMPTY>
<!ATTLIST RAIDb-2Scheduler
    level (passThrough | query | pessimisticTransaction) #REQUIRED
>
	

Here is a brief description of each scheduler level:

  • passThrough: queries are just assigned a unique identifier and forwarded as is to the load balancer, letting each database perform the scheduling and the locking. Therefore you will obtain the locking granularity provided by the database, which should be row-level locking. The load balancer just ensures that the writes are sent in the same order to all backends.

  • query: the query level scheduler is the most basic scheduler; it ignores transactions and schedules requests as they come. Reads can execute concurrently but a write blocks every other request execution until it completes. This scheduling might lead to database deadlocks under moderate load. Only use this scheduler if the requests are already scheduled upfront by the application.

  • optimisticTransaction: this is an optimistic transactional level scheduler that takes care of scheduling transactions in an optimistic way (writes on different tables can occur in parallel). An optimistic scheduler does not provide deadlock detection to rollback transactions that are deadlocking, but lets the database resolve the deadlocks and rollback the appropriate transactions.

  • pessimisticTransaction: this is a pessimistic transactional level scheduler that schedules transactions in a safe way (without possible deadlocks) but provides less parallelism for writes compared to optimistic scheduling (this is only noticeable on write-heavy workloads). This is the safest scheduler and it is recommended for most configurations.
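For example, the recommended pessimistic transaction level scheduler for a RAIDb-1 configuration is declared as follows:

<RequestScheduler>
  <RAIDb-1Scheduler level="pessimisticTransaction"/>
</RequestScheduler>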

11.6.3. Request Cache

A Request Cache can be composed of different caches that differ in the type of data they cache:

  • MetadataCache: this cache improves the ResultSet creation time by keeping the various field information along with their metadata. It is strongly encouraged to use this cache, which reduces both CPU and memory usage.

  • ParsingCache: allows a request to be parsed only once for all its executions. This reduces the CPU load on the controller.

  • ResultCache: this cache keeps the results associated with a given request. Cache entries can be invalidated according to various policies. This cache reduces the load on the database backends.

A RequestCache element is defined as follows:

<!ELEMENT RequestCache (MetadataCache?, ParsingCache?, ResultCache?)>
        
11.6.3.1. Metadata Cache

The MetadataCache caches ResultSet metadata and field meta-information associated with a query execution so that each time a query is executed, we don't have to gather all metadata from the underlying driver and we can build C-JDBC ResultSets much more efficiently both in terms of speed and memory usage. Note that if you use PreparedStatements, the query skeleton is used for cache matching instead of the instantiated query.

Example: SELECT * FROM t WHERE x=? hits on the same cache entry for all queries of this form for any value of x.

A MetadataCache element is defined as follows:

<!ELEMENT MetadataCache EMPTY>
<!ATTLIST MetadataCache
    maxNbOfMetadata CDATA "10000"
    maxNbOfField    CDATA "0"
>            
          

maxNbOfMetadata: maximum number of metadata entries to keep in the cache (default is 10000 and 0 means unlimited)

maxNbOfField: maximum number of field entries to keep in the cache (0 means unlimited and is the default setting).
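A minimal sketch keeping the default number of metadata entries but capping the number of field entries (the values are illustrative):

<MetadataCache maxNbOfMetadata="10000" maxNbOfField="100000"/>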

11.6.3.2. Parsing Cache

Parsing requests is a resource consuming process. The ParsingCache caches the result of the parsing processing so that a request is only parsed once for all its executions. If you are using PreparedStatements, the ParsingCache can store the query skeleton meaning that the cached parsing will match any instances of the skeleton.

Example: SELECT * FROM t WHERE x=? will be parsed only once for any value of x.

A ParsingCache element is defined as follows:

<!ELEMENT ParsingCache EMPTY>
<!ATTLIST ParsingCache
    backgroundParsing (true | false) "false"
    maxNbOfEntries    CDATA "5000"       
>            
          

Request parsing can be done sequentially when needed (backgroundParsing set to false, which is the default value) or forced to be performed in the background by a separate thread (meaning a new thread is created for each request that needs to be parsed).

maxNbOfEntries: defines the maximum number of entries to keep in the cache. The cache uses an LRU (Least Recently Used) replacement policy, meaning that the oldest entries are removed when the cache is full. A value of 0 means no limit on the cache size; the default is 5000 entries (as declared in the DTD).
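A minimal sketch enabling background parsing with a larger cache (the values are illustrative):

<ParsingCache backgroundParsing="true" maxNbOfEntries="10000"/>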

11.6.3.3. Result Cache

The ResultCache caches the results of queries. A query and its ResultSet are stored in the cache so that if the same query is executed again, the ResultSet stored in the cache is returned.

ResultCacheRule elements define the cache coherency and policy. The default cache behavior is eager consistency for all queries (ResultSets returned by the cache are always coherent and up-to-date). See below (ResultCacheRule element) to relax the cache consistency and achieve better performance. Note that ResultSet caching is disabled if no result cache element is found in the configuration file.

If two identical requests are to be executed at the same time, only one is executed and the second one waits until the completion of the first one (this is the behavior with the default pendingTimeout value, which is 0). To prevent the second request from waiting forever, a pendingTimeout value in seconds can be defined for the waiting request. If the timeout expires, the request is executed in parallel with the first one.

A result cache element is described as follows:

<!ELEMENT ResultCache (DefaultResultCacheRule?, ResultCacheRule*)>
<!ATTLIST ResultCache 
    granularity   (database | table | column | columnUnique) "database"
    maxNbOfEntries CDATA "100000"
    pendingTimeout CDATA "0"
>
        

The result cache granularity defines how entries are removed from the cache. database flushes the whole cache on every write access. This is the default cache setting. table and column provide table-based and column-based invalidations, respectively. columnUnique can optimize requests that select a unique primary key (useful with EJB entity beans).

You can specify the maximum number of entries (default is 100000) to limit the cache size. This is obviously not as efficient as setting a cache size in bytes, but in the latter case we would have to spend a lot of time computing the size of the result sets returned by queries (Java does not provide a sizeof operator!). The console does, however, display the cache size in bytes when viewing the cache.

Finer grain tuning of the cache is based on rules matching query patterns. A queryPattern is a regular expression to match according to Jakarta Regexp (see their web site for more information). A default cache rule defines the policy if no other rule matches:

	<!ELEMENT DefaultResultCacheRule (NoCaching | EagerCaching | RelaxedCaching)>
	<!ATTLIST DefaultResultCacheRule
	    timestampResolution CDATA "1000"
	>
	
	<!ELEMENT ResultCacheRule (NoCaching | EagerCaching | RelaxedCaching)>
	<!ATTLIST ResultCacheRule
	    queryPattern        CDATA #REQUIRED
	    caseSensitive       (true | false) "false"
	    applyToSkeleton     (true | false) "false"
	    timestampResolution CDATA "1000"
	>
        
  • queryPattern: the regular expression to match.

  • caseSensitive: true if the pattern matching must be case sensitive

  • applyToSkeleton: true if the pattern must apply to the query skeleton (as found in a PreparedStatement), false if the instantiated query should be used. Example: skeleton is SELECT * FROM t WHERE x=? and instantiated query is SELECT * FROM t WHERE x=12.

  • timestampResolution: If a query contains a NOW() macro, it is replaced with the current date on the controller. timestampResolution indicates the resolution (in milliseconds) to use when replacing NOW() with the current date. If the resolution is below 1 second (value <1000ms), the request is never kept in the cache because there is almost no chance that the same request will come with the same timestamp. Note that this timestamp is for the cache only and you can use a greater resolution for the load balancer (see below).

[Note]Note

If timestampResolution is set to 60000, every execution of a query like SELECT * FROM x WHERE date=NOW() will be replaced with the same value for 1 minute (i.e. SELECT * FROM x WHERE date="2012-11-15 08:03:00.000") and therefore the cache entry may be hit for 1 minute.

To define a default rule that disables caching, use:

<DefaultResultCacheRule>
  <NoCaching/>
</DefaultResultCacheRule>
	

If no default rule is provided, the following default rule is assumed:

<ResultCacheRule queryPattern="default" timestampResolution="1000">
  <EagerCaching/>
</ResultCacheRule>
	

Each cache rule can have a different caching behavior. The available behaviors are the following:

<!ELEMENT NoCaching EMPTY>
<!ELEMENT EagerCaching EMPTY>
<!ELEMENT RelaxedCaching EMPTY>
<!ATTLIST RelaxedCaching 
     timeout        CDATA "60"
     keepIfNotDirty (true | false) "true"
>
	
  • NoCaching means that the matching results are not put in the cache.

  • EagerCaching means that all entries in the cache are always coherent and any update query (insert, delete, update, ...) automatically invalidates the corresponding entries in the cache. This was the previous cache behavior for all queries.

  • RelaxedCaching means that a timeout is set for the entry and the entry is kept in the cache until the timeout expires. When the timeout expires, if no write has modified the corresponding result and keepIfNotDirty is set to true, the entry is kept in the cache and the timeout is reset to its initial value.

[Note]Note

RelaxedCaching may provide stale data. The timeout defines the maximum staleness of a cache entry. It means that the cache may return an entry that is out of date.

Here is a cache rule example:

<ResultCacheRule queryPattern="select ? from b where id=?" applyToSkeleton="true">
  <RelaxedCaching timeout="6000" keepIfNotDirty="true"/>
</ResultCacheRule>
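Putting the pieces together, a complete request cache definition could look like the following sketch (the granularity, pattern and sizes are illustrative):

<RequestCache>
  <MetadataCache/>
  <ParsingCache/>
  <ResultCache granularity="table" maxNbOfEntries="50000" pendingTimeout="0">
    <DefaultResultCacheRule>
      <EagerCaching/>
    </DefaultResultCacheRule>
    <ResultCacheRule queryPattern="^select \* from logs">
      <NoCaching/>
    </ResultCacheRule>
  </ResultCache>
</RequestCache>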
	

11.6.4. Load Balancer

The load balancer defines the way requests will be distributed among the backends according to a RAIDb level (see Section 10, “RAIDb Basics”). The following load balancers are available:

  • SingleDB: load balancer for a single database backend instance. This is only available if you use a single controller.

  • ParallelDB: load balancer to use with a parallel database such as Oracle Parallel Server or Middle-R. Both reads and writes are load balanced on the backends, the parallel database being responsible for replicating the writes.

  • RAIDb-0: full database partitioning (no table can be replicated) with an optional policy specifying where new tables are created.

  • RAIDb-1: full database mirroring (all tables are replicated everywhere) with an optional policy specifying how distributed queries (writes/commit/rollback) completion is handled (when the first, a majority or all backends complete).

  • RAIDb-1ec: full database mirroring (like RAIDb-1) with error checking for byzantine failure detection.

  • RAIDb-2: partial replication (each table must be at least replicated once) with optional policies for new table creation (like RAIDb-0) and distributed queries completion (like RAIDb-1).

  • RAIDb-2ec: partial replication (like RAIDb-2) with error checking for byzantine failure detection.

The load balancer element definition is as follows:

<!ELEMENT LoadBalancer (SingleDB | ParallelDB | RAIDb-0 | RAIDb-1 | RAIDb-1ec | RAIDb-2 | RAIDb-2ec)>
	
11.6.4.1. SingleDB load balancer

The SingleDB load balancer does not need any specific parameter. The definition of the SingleDB element is as follows:

 <!ELEMENT SingleDB EMPTY> 
11.6.4.2. ParallelDB load balancer

The ParallelDB load balancer must be used with a SingleDB request scheduler. This load balancer provides two implementations: ParallelDB-RoundRobin and ParallelDB-LeastPendingRequestsFirst, providing round robin and least pending requests first load balancing policies, respectively. ParallelDB load balancers are designed to provide load balancing and failover on top of parallel databases such as Oracle Parallel Server or Middle-R. It means that read and write requests are just sent to one of the alive backends, the parallel database being responsible for maintaining the consistency between the backends. The definition of the ParallelDB element is as follows:

<!ELEMENT ParallelDB (ParallelDB-RoundRobin | ParallelDB-LeastPendingRequestsFirst)>

<!ELEMENT ParallelDB-RoundRobin EMPTY>
<!ELEMENT ParallelDB-LeastPendingRequestsFirst EMPTY>
          

No specific settings are required for these load balancers. They do not require request parsing which means that requests are just forwarded as is to the backends (rewriting rules are still applied but no automatic transformation is performed).

11.6.4.3. RAIDb-0 load balancer

The RAIDb-0 load balancer accepts a policy to specify where new tables are created. The definition of the RAIDb-0 element is as follows:

<!ELEMENT RAIDb-0 (MacroHandling?,CreateTable*)>

<!ELEMENT CreateTable (BackendName*)>
<!ATTLIST CreateTable 
    tableName     CDATA #IMPLIED
    policy        (random | roundRobin | all) #REQUIRED
    numberOfNodes CDATA #REQUIRED
>
    
<!-- BackendName simply identifies a backend by its logical name -->
<!ELEMENT BackendName EMPTY>
<!ATTLIST BackendName
    name CDATA #REQUIRED
>

If MacroHandling is omitted, a default MacroHandling element is added.

CreateTable defines the policy to adopt when creating a new table. This policy is based on the given list of BackendName nodes (which might be a subset of the complete set of backends). If the backend list is omitted, then all enabled backends are taken at decision time. The attributes have the following meaning:

  • numberOfNodes represents the number of backends to pickup from the BackendName list to apply the policy (it must be set to 1 for RAIDb-0 load balancers and can never be greater than the number of nodes declared in the BackendName list).

  • policy works as follows:

    • random: numberOfNodes backends are picked up randomly from the BackendName list and the table is created on these nodes.

    • roundRobin: numberOfNodes backends are picked up from the BackendName list using a round-robin algorithm and the table is created on these nodes.

    • all: the table is created on all nodes in the BackendName list (numberOfNodes is ignored).

Here is an example of a RAIDb-0 controller with three nodes where new tables are created randomly on the first two nodes:

...
<DatabaseBackend name="node1" ...
<DatabaseBackend name="node2" ...
<DatabaseBackend name="node3" ...
...

<LoadBalancer>
  <RAIDb-0>
    <CreateTable policy="random" numberOfNodes="1">
      <BackendName name="node1" />
      <BackendName name="node2" />
    </CreateTable>
  </RAIDb-0>
</LoadBalancer>
	  
11.6.4.4. RAIDb-1: full mirroring load balancer

A RAIDb-1 load balancer is defined as follows:

<!ELEMENT RAIDb-1 (WaitForCompletion?, MacroHandling?, (RAIDb-1-RoundRobin | 
           RAIDb-1-WeightedRoundRobin | RAIDb-1-LeastPendingRequestsFirst))>

<!ELEMENT RAIDb-1-RoundRobin EMPTY>
<!ELEMENT RAIDb-1-WeightedRoundRobin (BackendWeight)>
<!ELEMENT RAIDb-1-LeastPendingRequestsFirst EMPTY>

<!ELEMENT WaitForCompletion EMPTY>
<!ATTLIST WaitForCompletion
    policy (first | majority | all) "first"
>

<!ELEMENT BackendWeight EMPTY>
<!ATTLIST BackendWeight
    name   CDATA #REQUIRED
    weight CDATA #REQUIRED
>
	  

If WaitForCompletion is omitted, the default behaviour is to return the result as soon as one backend has completed.

If MacroHandling is omitted, a default MacroHandling element is added.

The RAIDb-1 load balancer accepts a policy to specify how distributed query completion is handled. Several load balancing policies are proposed:

  • RoundRobin: simple round-robin load balancing. The first request is sent to the first node, the second request to the second node, etc... Once a request has been sent to the last backend, the next request is sent to the first backend and so on.

  • WeightedRoundRobin: same as round-robin but a weight is associated with each backend. A backend with a weight of 2 will get twice as many requests as a backend with a weight of 1.

  • LeastPendingRequestsFirst: the request is sent to the backend that has the least pending requests to execute (that can be considered as the shortest pending request queue).

WaitForCompletion defines the policy to adopt when waiting for the completion of a request. The policy works as follows:

  • first: returns the result as soon as one node has completed.

  • majority: returns the result as soon as a majority of nodes (n/2+1) has completed.

  • all: waits for all nodes to complete before returning the result to the client.
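As an illustration, a complete RAIDb-1 load balancer that returns results as soon as one backend has completed and balances reads on the least loaded backend could be declared as follows:

<LoadBalancer>
  <RAIDb-1>
    <WaitForCompletion policy="first"/>
    <RAIDb-1-LeastPendingRequestsFirst/>
  </RAIDb-1>
</LoadBalancer>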

11.6.4.5. RAIDb-1ec load balancer

The RAIDb-1 load balancer with error checking must provide an error checking policy (defined below). The optional WaitForCompletion policy only concerns write requests (INSERT, DELETE, UPDATE, commit, ...).

[Note]Note

RAIDb-1ec is not operational in C-JDBC v1.0alpha.

The definition of the RAIDb-1ec element is as follows:

<!ELEMENT RAIDb-1ec (WaitForCompletion?, ErrorChecking, (RAIDb-1ec-RoundRobin | 
                     RAIDb-1ec-WeightedRoundRobin))>
<!ATTLIST RAIDb-1ec
    nbOfConcurrentReads CDATA #REQUIRED
>

<!ELEMENT RAIDb-1ec-RoundRobin EMPTY>
<!ELEMENT RAIDb-1ec-WeightedRoundRobin (BackendWeight)>

<!ELEMENT ErrorChecking EMPTY>
<!ATTLIST ErrorChecking 
    policy        (random | roundRobin | all) #REQUIRED
    numberOfNodes CDATA #REQUIRED
>
	  

Error checking policy (for RAIDb-1ec and RAIDb-2ec). Error checking is used to detect byzantine failures of nodes, that is, nodes that return erroneous results in a non-deterministic way. Error checking allows read queries to be sent to more than one backend and the results to be compared. A majority of nodes must agree on the result that will be sent to the client. Error checking policies are defined as follows:

  • random: numberOfNodes backends are picked up randomly; the read request is sent to these backends and results are compared.

  • roundRobin: numberOfNodes backends are picked up using a round-robin algorithm; the read request is sent to these backends and the results are compared.

  • all: the request is sent to all nodes (numberOfNodes is ignored) and the results are compared.

numberOfNodes must be greater than or equal to 3.

11.6.4.6. RAIDb-2: distributed mirroring load balancer

The definition of the RAIDb-2 element is as follows:

<!ELEMENT RAIDb-2 (CreateTable*, WaitForCompletion?, MacroHandling?, (RAIDb-2-RoundRobin | 
 RAIDb-2-WeightedRoundRobin | RAIDb-2-LeastPendingRequestsFirst))>


<!ELEMENT RAIDb-2-RoundRobin EMPTY>
<!ELEMENT RAIDb-2-WeightedRoundRobin (BackendWeight)>
<!ELEMENT RAIDb-2-LeastPendingRequestsFirst EMPTY>
	  

If MacroHandling is omitted, a default MacroHandling element is added.

The RAIDb-2 load balancer accepts a policy to specify where new tables are created and how distributed queries completion should be handled. Several load balancing policies are proposed:

  • RoundRobin: simple round-robin load balancing. The first request is sent to the first node, the second request to the second node, etc... Once a request has been sent to the last backend, the next request is sent to the first backend and so on.

  • WeightedRoundRobin: same as round-robin but a weight is associated with each backend. A backend with a weight of 2 will get twice as many requests as a backend with a weight of 1.

  • LeastPendingRequestsFirst: the request is sent to the backend that has the least pending requests to execute (that can be considered as the shortest pending request queue).

The CreateTable element is defined in Section 11.6.4.3, “RAIDb-0 load balancer”.

The WaitForCompletion element is defined in Section 11.6.4.4, “RAIDb-1: full mirroring load balancer”.
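As a sketch, a partial replication load balancer that creates each new table on two backends chosen in a round-robin fashion and waits for a majority of backends on distributed queries could be declared as follows (the values are illustrative):

<LoadBalancer>
  <RAIDb-2>
    <CreateTable policy="roundRobin" numberOfNodes="2"/>
    <WaitForCompletion policy="majority"/>
    <RAIDb-2-LeastPendingRequestsFirst/>
  </RAIDb-2>
</LoadBalancer>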

11.6.4.7. RAIDb-2ec load balancer

The RAIDb-2 with error checking must provide an error checking policy as in RAIDb-1ec (see Section 11.6.4.5, “RAIDb-1ec load balancer”). The other elements are similar to those defined for the RAIDb-2 controller (see Section 11.6.4.6, “RAIDb-2: distributed mirroring load balancer”).

[Note]Note

RAIDb-2ec is not operational in C-JDBC v1.0alpha.

The definition of the RAIDb-2ec element is as follows:

<!ELEMENT RAIDb-2ec (CreateTable*, WaitForCompletion?, ErrorChecking,
                     (RAIDb-2ec-RoundRobin | RAIDb-2ec-WeightedRoundRobin))>
<!ATTLIST RAIDb-2ec 
    nbOfConcurrentReads CDATA #REQUIRED
>

<!ELEMENT RAIDb-2ec-RoundRobin EMPTY>
<!ELEMENT RAIDb-2ec-WeightedRoundRobin (BackendWeight)>
	  

11.6.5. Recovery Log

The C-JDBC Recovery Log stores write queries and transactions between logical checkpoints defined by the user. The log can only be stored in a database (or cluster of databases) using a JDBCRecoveryLog element.

The definition of a RecoveryLog element is as follows:

<!ELEMENT RecoveryLog (JDBCRecoveryLog)>
	
11.6.5.1. JDBCRecoveryLog

The JDBCRecoveryLog stores the recovery information in a database. To access this database, you must provide the driver class name to load (driver), an optional jar file or directory where the driver class can be found (driverPath), the JDBC URL to access the database, as well as a valid login/password.

A timeout in seconds can be defined for the SQL requests. If no value is given, the default timeout is 60 seconds. Warning! 0 means no timeout and waits forever until completion.

recoveryBatchSize is used to speed up the recovery process by allowing several queries to be accumulated into a batch on the recovering backend. Increasing this value beyond a certain limit will not increase performance and will consume a significant amount of memory. Default is 10 and minimum is 1.

The recovery information is stored in 3 tables defined by the RecoveryLogTable, CheckpointTable and BackendTable elements; dump information is stored in an additional table defined by the DumpTable element.

The definition of a JDBCRecoveryLog element is as follows:

<!ELEMENT JDBCRecoveryLog (RecoveryLogTable, CheckpointTable, BackendTable)>

<!ATTLIST JDBCRecoveryLog 
    driver            CDATA #REQUIRED
    driverPath        CDATA #IMPLIED
    url               CDATA #REQUIRED
    login             CDATA #REQUIRED
    password          CDATA #REQUIRED
    requestTimeout    CDATA "60"
    recoveryBatchSize CDATA "10"
>
<!ELEMENT RecoveryLogTable EMPTY>
<!ATTLIST RecoveryLogTable 
    createTable              CDATA "CREATE TABLE"
    tableName                CDATA "logtable"  
    idColumnType             CDATA "BIGINT NOT NULL UNIQUE"
    vloginColumnType         CDATA "VARCHAR(20) NOT NULL"
    sqlColumnName            CDATA "sql"
    sqlColumnType            CDATA "TEXT NOT NULL"
    transactionIdColumnType  CDATA "BIGINT NOT NULL"
    extraStatementDefinition CDATA ""
>

<!ELEMENT CheckpointTable EMPTY>
<!ATTLIST CheckpointTable 
    createTable              CDATA "CREATE TABLE"
    tableName                CDATA "checkpointtable"
    checkpointNameColumnType CDATA "VARCHAR(127) NOT NULL"
    requestIdColumnType      CDATA "BIGINT"
    extraStatementDefinition CDATA ",PRIMARY KEY (name)"
>

<!ELEMENT BackendTable EMPTY>
<!ATTLIST BackendTable 
    createTable                  CDATA "CREATE TABLE"
    tableName                    CDATA "backendtable"  
    databaseNameColumnType       CDATA "VARCHAR(50) NOT NULL"
    backendNameColumnType        CDATA "VARCHAR(50) NOT NULL"
    backendStateColumnType       CDATA "INTEGER"
    checkpointNameColumnType     CDATA "VARCHAR(127) NOT NULL"  
    extraStatementDefinition     CDATA ""
>

<!ELEMENT DumpTable EMPTY>
<!ATTLIST DumpTable
    createTable              CDATA "CREATE TABLE"
    tableName                CDATA "dumptable"
    dumpNameColumnType       CDATA "TEXT NOT NULL"
    dumpDateColumnType       CDATA "TEXT NOT NULL"
    dumpPathColumnType       CDATA "TEXT NOT NULL"
    dumpFormatColumnType     CDATA "TEXT NOT NULL"
    checkpointNameColumnType CDATA "TEXT NOT NULL"  
    backendNameColumnType    CDATA "TEXT NOT NULL"
    tablesColumnType         CDATA "TEXT NOT NULL"
    extraStatementDefinition CDATA ""
>
	  

The JDBCRecoveryLog element requires the following attributes:

  • driver: the driver class name

  • url: the JDBC URL to access the database

  • login: the user login to connect to the database

  • password: the user password to connect to the database

  • requestTimeout: optional timeout in seconds used when replaying the log queries. Default timeout is 60 seconds and 0 means no timeout (wait forever until a request completes).

  • recoveryBatchSize: used to speed up the recovery process by allowing several queries to be accumulated into a batch on the recovering backend. Increasing this value beyond a certain limit will not increase performance and will consume a significant amount of memory. Default is 10 and minimum is 1.

The RecoveryLogTable defines how the JDBCRecoveryLog log table is created. The log table name (tableName) must conform to the syntax of a database table name. The log table stores a unique request id (id), the virtual login (vlogin) to use to execute the SQL statement (sql) in the given transaction (transaction_id). The statement used by the JDBCRecoveryLog to create the log table uses the RecoveryLogTable attributes as follows:

  CREATE TABLE tableName (
    id             idColumnType,
    vlogin         vloginColumnType,
    sql            sqlColumnType,
    transaction_id transactionIdColumnType
    extraStatementDefinition)
    	

If all default values are used, the log table is created using the following statement:

  CREATE TABLE logtable (
    id             BIGINT NOT NULL UNIQUE, 
    vlogin         VARCHAR(20) NOT NULL, 
    sql            TEXT NOT NULL, 
    transaction_id BIGINT NOT NULL
    )
    	
    	

The CheckpointTable stores the checkpoint name and the corresponding index in the recovery log table. The statement used by the JDBCRecoveryLog to create the checkpoint table uses the CheckpointTable attributes as follows:

  CREATE TABLE tableName (
    name       checkpointNameColumnType,
    request_id requestIdColumnType
    extraStatementDefinition)
	  

If all default values are used, the checkpoint table is created using the following statement:

  CREATE TABLE checkpointtable (
    name       VARCHAR(127) NOT NULL,
    request_id BIGINT,
    PRIMARY KEY(name))
    	

The BackendTable stores the states of the different backends of a virtual database. It stores the name of the backend, the database it belongs to, the last known checkpoint of the backend when it was disabled, and the state the backend was in when the virtual database was last shut down. If all default values are used, the backend table is created using the following statement:

  CREATE TABLE backendtable (
    databaseName   VARCHAR(50) NOT NULL,
    backendName    VARCHAR(50) NOT NULL,
    backendState   INTEGER,
    checkpointName VARCHAR(127) NOT NULL
    )
  

Here is an example of how to define a JDBCRecoveryLog to work with an HSQL database:

<RecoveryLog>
   <JDBCRecoveryLog driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost" login="sa" password="">
      <RecoveryLogTable 
         tableName="recovery"
         idColumnType="INTEGER NOT NULL" 
         sqlColumnType="VARCHAR NOT NULL" 
        extraStatementDefinition=",PRIMARY KEY (id)"/>
      <CheckpointTable tableName="checkpoint"/>
      <BackendTable tableName="backendTable"/>
   </JDBCRecoveryLog>
</RecoveryLog>
          

The DumpTable stores the dump names and associated meta-data such as the corresponding checkpoint name. The statement used by the JDBCRecoveryLog to create the dump table uses the DumpTable attributes as follows:

  CREATE TABLE tableName (
    dump_name       dumpNameColumnType,
    dump_date       dumpDateColumnType,
    dump_path       dumpPathColumnType,
    dump_format     dumpFormatColumnType,
    checkpoint_name checkpointNameColumnType,
    backend_name    backendNameColumnType,
    tables          tablesColumnType
    extraStatementDefinition)
            

dump_name is the dump logical name, dump_date the date at which the backup was started, dump_path the path where the dump can be found, dump_format an implementation-specific text form that specifies the method used for the dump, checkpoint_name the name of the checkpoint associated with this dump, backend_name the name of the backend that was dumped, and tables the list of tables contained in this dump (* means all tables). If all default values are used, the dump table is created using the following statement:

  CREATE TABLE dumptable (
    dump_name       TEXT NOT NULL,
    dump_date       TEXT NOT NULL,
    dump_path       TEXT NOT NULL,
    dump_format     TEXT NOT NULL,
    checkpoint_name TEXT NOT NULL,
    backend_name    TEXT NOT NULL,
    tables          TEXT NOT NULL
    )            
          

11.7. SSL Configuration

SSL may be used for encryption as well as authentication for all connections to the C-JDBC controller.

SSL support for C-JDBC is based on the Java Secure Socket Extension (JSSE). JSSE has been integrated into the Java 2 SDK, Standard Edition, v 1.4. For Java 1.3 it can be installed as an optional package (available at http://java.sun.com/products/jsse/).

11.7.1. Controller

On the controller side, SSL can be configured for all JMX connections and for the virtual database connections made through the C-JDBC driver, using the SSL element in the controller configuration:

              <!ELEMENT SSL EMPTY>
              <!ATTLIST SSL
                  keyStore              CDATA        #IMPLIED
                  keyStorePassword      CDATA        #IMPLIED
                  keyStoreKeyPassword   CDATA        #IMPLIED
                  isClientAuthNeeded    (true|false) "false"
                  trustStore            CDATA        #IMPLIED
                  trustStorePassword    CDATA        #IMPLIED
              >
            
  • keyStore: the file where the keys are stored.

  • keyStorePassword: the password to the keyStore.

  • keyStoreKeyPassword: the password to the private key; if none is specified, the same password as for the store is used.

  • isClientAuthNeeded: if set to false, SSL is used for encryption only; if set to true, the server only accepts trusted clients (the client certificate has to be in the trust store).

  • trustStore: the file where the trusted certificates are stored; if none is specified, the same store as for the keys is used.

  • trustStorePassword: the password to the trustStore; if none is specified, the same password as for the keyStore is used.
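As an illustration, an SSL element enforcing client authentication could look like the following sketch (the file names and passwords are illustrative values):

<SSL keyStore="config/controller.keystore"
     keyStorePassword="controllerPwd"
     isClientAuthNeeded="true"
     trustStore="config/controller.keystore"
     trustStorePassword="controllerPwd"/>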

11.7.2. Console / JMX Clients

The console and other JMX clients are configured with the following Java properties:

  • javax.net.ssl.keyStore

  • javax.net.ssl.keyStorePassword

  • javax.net.ssl.trustStore

  • javax.net.ssl.trustStorePassword

Example: -Djavax.net.ssl.trustStore=client.keystore -Djavax.net.ssl.trustStorePassword=clientpassword

11.7.3. Driver

SSL on the driver side is configured with the following Java properties:

  • cjdbc.ssl.enabled=true

  • javax.net.ssl.keyStore

  • javax.net.ssl.keyStorePassword

  • javax.net.ssl.trustStore

  • javax.net.ssl.trustStorePassword

Example: -Djavax.net.ssl.trustStore=client.keystore -Djavax.net.ssl.trustStorePassword=clientpassword

11.7.4. Certificates (public and private keys)

You may create your certificates with the keytool utility shipped with the JDK:

  1. Create a self-signed server and a self-signed client key each in its own keystore

    $> keytool -genkey -v -keyalg RSA -keystore server.keystore -dname "CN=Server, OU=Bar, O=Foo, L=Some, ST=Where, C=UN"
    $> keytool -genkey -v -keyalg RSA -keystore client.keystore -dname "CN=Client, OU=Bar, O=Foo, L=Some, ST=Where, C=UN"
                       
  2. Export the server's and the client's public keys from their respective keystores

    $> keytool -export -rfc -keystore server.keystore -alias mykey -file server.public-key
    $> keytool -export -rfc -keystore client.keystore -alias mykey -file client.public-key
                       
  3. Import the client's public key to the server's keystore, and vice-versa:

    $> keytool -import -alias client -keystore server.keystore -file client.public-key
    $> keytool -import -alias server -keystore client.keystore -file server.public-key
                       

11.8. Configuration Examples

Configuration files examples are available in the C-JDBC distribution in the /c-jdbc/doc/examples directory.

Here is a brief overview of each example content:

  • Cache: gives various configuration examples on how to use the cache.

  • Derby: contains examples for the Apache Derby database including the ones used in the ApacheCon demos.

  • HorizontalScalbility: provides configuration files to create a distributed virtual database on 2 controllers. One file should be loaded on each of the two controllers.

  • LinuxService and SuSE: contains examples to run the C-JDBC controller as a Linux service.

  • SingleDB: a C-JDBC configuration with a unique MySQL backend.

  • RAIDb-0: C-JDBC configuration examples for RAIDb-0.

    • RAIDb-0.xml: a simple 2-node RAIDb-0 configuration.

    • RAIDb-0-schema.xml: a 2-node RAIDb-0 configuration using a static database schema definition matching the RUBiS benchmark database schema.

  • RAIDb-1: contains various RAIDb-1 configuration examples.

  • RAIDb-2: contains various RAIDb-2 configuration examples.

  • RecoveryLog: gives an example of a fault tolerant recovery log.



[3] First In First Out.