The POCO C++ Libraries Blog

The latest on Data

Filed under: Uncategorized by alex at 03:54

The latest is null support. Not being able to insert/update and check for nulls was an important shortcoming, so I have pushed it ahead of Unicode.

Now, the following works:

ses << "CREATE TABLE NullTest (i INTEGER)", now;
ses << "INSERT INTO NullTest VALUES(:i)", use(null), now;
RecordSet rs(ses, "SELECT * FROM NullTest");
rs.moveFirst();
assert (rs.isNull("i"));
assert (rs["i"] == 0);

Update: the commented-out portion that follows does not apply anymore (see the reply below)

/*

SQLite is a bit more gracious than ODBC and will accept a generic null value. For ODBC, the difference is that when inserting/updating null fields, the data type has to be specified, so the above code looks like this:

ses << "INSERT INTO NullTest VALUES(?)", use(NULL_INT32), now;

The rest is the same. ODBC code will work for SQLite (which will gracefully take any of the NullData enum values), while the other way around is not true – ODBC will throw up when fed the generic one.

*/

Code is in SVN, tested on Windows with all supported drivers and on Linux with PostgreSQL. Comments and bug reports are welcome.

8 Comments »
  1. Does ‘Linux with PostgreSQL’ imply at all that a native Postgres driver is in the works?

    One thing I’d be very interested in is whether there is any support for INSERT statements that have multiple rows of values.

    Comment by James Mansion on June 21, 2007, 12:46

  2. Things always look better after a good night’s sleep and an early morning run :-).
    So, I retract the above statement about ODBC and generic null values. The simple ‘use(null)’ syntax can be used with ODBC. All the supported ODBC drivers seem to be happy with specifying char as the type, regardless of the real underlying field type.
    Since I’ve seen ODBC drivers go overboard to adhere to the specification (or the author’s interpretation thereof), I won’t remove the type-specific null enum values and related code. Just in case.
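
    For reference, a minimal side-by-side sketch of the two binding styles (same NullTest table as in the post above):

    // generic null - works with all the ODBC drivers tested so far
    ses << "INSERT INTO NullTest VALUES(?)", use(null), now;
    // type-specific null enum value - kept around just in case a driver insists
    ses << "INSERT INTO NullTest VALUES(?)", use(NULL_INT32), now;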

    Comment by alex on June 21, 2007, 13:47

  3. James,

    Linux with PostgreSQL means the PostgreSQL ODBC driver on Linux with unixODBC – that’s what I have fully tested so far. PostgreSQL is currently the only one that passes all the tests on Linux.

    As a PostgreSQL fan, I would very much like to see a native PostgreSQL connector in Data, but the usual fuss about lack of time applies. A contribution there would be very much appreciated, so anyone willing to help, email me at(aleskx, dot(gmail, com)). I’ll set up the project framework and provide development support; you supply the code.

    As for inserting multiple values, yes we do support that for pretty much all the STL containers. Here’s an excerpt from the tests:

    std::vector<std::string> lastNames;
    std::vector<std::string> firstNames;
    std::vector<std::string> addresses;
    std::vector<int> ages;
    lastNames.push_back("LN1");
    lastNames.push_back("LN2");
    firstNames.push_back("FN1");
    firstNames.push_back("FN2");
    addresses.push_back("ADDR1");
    addresses.push_back("ADDR2");
    ages.push_back(1);
    ages.push_back(2);

    session << "INSERT INTO PERSON VALUES (?,?,?,?)", use(lastNames), use(firstNames), use(addresses), use(ages), now;

    You can do the same with a container of Tuples:

    typedef Tuple<int,int,int> TupleType;
    TupleType t(0,1,2);
    TupleType t10(10,11,12);
    TupleType t100(100,101,102);
    std::vector<TupleType> v;
    v.push_back(t);
    v.push_back(t10);
    v.push_back(t100);
    session << "INSERT INTO Tuples VALUES (?,?,?)", use(v), now;

    In the above code, std::list or std::queue can be readily used instead of std::vector.

    Update: yes, std::set, too.
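
    For instance, a minimal sketch of the Tuple insert above with std::list swapped in for std::vector (assuming the same Tuples table and TupleType):

    std::list<TupleType> l;   // any sequence container the framework can iterate
    l.push_back(t);
    l.push_back(t10);
    l.push_back(t100);
    session << "INSERT INTO Tuples VALUES (?,?,?)", use(l), now;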

    Comment by alex on June 21, 2007, 14:11

  4. Oh. Well, with that second issue, does it translate into:

    INSERT INTO … VALUES (…)
    INSERT INTO … VALUES (…)
    INSERT INTO … VALUES (…)
    and so on

    or

    INSERT INTO … VALUES (…), (…), (…)

    My experience so far has been that the latter can be a great deal faster.

    I suppose I’ll have to look at Data eventually ;-).

    Issues like this and support for multiple result sets from an SQL batch or stored proc invocation are somewhat important to me though. My experience as a Sybase developer has been that round-trip latency is a major cause of application slowness.

    James

    Comment by James Mansion on June 22, 2007, 12:39

  5. I am aware of bulk operations and multiple result sets, and of their performance advantages. Unfortunately, at this time, those are not supported. I will definitely look into it for 1.5.

    Comment by alex on June 22, 2007, 13:16

  6. > I am aware of bulk operations

    Can we maybe find better terminology here? Certainly from a SOCI point of view it would seem to equate bulk with ‘prepared’, and I think there’s a taxonomy of performance-enhancing approaches:

    plain parse-and-go single row
    prepared single row
    multi-row single statement (e.g. the pgsql example above)
    batched set of statements
    bulk insert

    To me a bulk operation is the last of these and is exemplified by a Sybase BCP or PostgreSQL COPY operation: it may or may not be logged or transacted, and has limited flexibility, but is the fastest mechanism available.

    The SQL INSERT statement is quite flexible in terms of specifying the column set, or sometimes even inserting into a view, so I think it’s probably best managed separately by creating an object that contains table metadata, a special bulk connection, and then an explicit bulk insert against that connection which references the metadata object, where all attributes must be specified and an STL iterator pair defines the source.
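
    A rough sketch of what such an interface might look like (every name here is hypothetical; nothing like this exists in Data today):

    // hypothetical sketch only - illustrates the proposal, not an existing API
    typedef Tuple<std::string, std::string, std::string, int> PersonRow; // hypothetical row type
    TableMetadata meta("PERSON");          // table metadata: all columns must be specified
    meta.addColumn("LAST_NAME");
    meta.addColumn("FIRST_NAME");
    meta.addColumn("ADDRESS");
    meta.addColumn("AGE");
    BulkConnection bulk(connectionString); // a dedicated connection for bulk work (BCP/COPY underneath)
    std::vector<PersonRow> rows;
    // ... fill rows ...
    bulkInsert(bulk, meta, rows.begin(), rows.end()); // explicit bulk insert, STL iterator pair as the source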

    The bulk insert facilities seem to vary more widely than other bits of CLI API, and in particular may not work against a standard database connection.

    Comment by James Mansion on June 26, 2007, 17:02

  7. James,

    you are right, there is a finer classification than simply calling it all bulk. I’ll make a note of this for future development. Also, help from folks like you is very valuable, because we are trying to put everything behind a common interface and sometimes it is hard to predict what kind of constraints a new connector will impose.
    More on this in the next development cycle for Data.

    PLOP! (that’s the sound of me jumping into the Adriatic Sea to cool myself down ;-)

    Alex

    Comment by alex on June 28, 2007, 10:48

  8. Cool!

    BTW, I suggest looking at the libpqxx table stream for the bulk insert, but be careful with any expectation that the operation will be transacted or can be performed on a normal connection.

    It would be nice if databases that offer it (PostgreSQL, Firebird AFAIK) could provide async alerts too. This will need careful design, since some file descriptor monitoring, or polling, may be necessary. This may be another case where it’s worth requiring a separate connection as a conservative approach to avoid problems on other systems.

    Have a great holiday, anyway.

    Comment by James Mansion on June 28, 2007, 17:35
